Abstract


The objective of this class is to experience an introduction to the rich, complex, 
and powerful subject of Ordinary Differential Equations (ODEs). Specifically: 


(1) Develop a working familiarity with linear algebra to the extent we need it
for the differential equations we shall consider. Linear algebra serves us
as a very robust backend for handling all higher-dimensional linear issues
which will arise.

(2) Learn how to solve a reasonably large class of differential equations. Most
differential equations cannot be solved (the solutions can only be approximated
with computers, which is a story for a different math class), but
we will teach you many of the differential equations for which we can find
exact solutions.

(3) Observe and investigate real-world applications which are governed by
differential equations.

(4) Study qualitative properties of both the differential equations we can solve
and those we cannot.


The textbook for the course is Differential Equations, Second Edition, by John
Polking, Albert Boggess, and David Arnold [2]. These notes are based on this
textbook, except that, for the sake of time, we include only a curated portion of the
textbook material in these notes. Any and all comments, typos, errors, questions, and
suggestions are enthusiastically welcome!


Last revised June 1, 2020. 
2010 Mathematics Subject Classification. Primary . 
The first author is supported by the National Science Foundation under Award No. 1703709. 
Introduction


The prerequisite for this course is Math31B: Integration and Infinite Series. Conse- 
quently, we will assume you have a working familiarity with the basic properties of 
differentiation and integration of common elementary functions (although we will 
review the tools which are most relevant for us). In this class we will put these 
existing tools to work to help us solve so-called differential equations. We begin 
with a simple example of a differential equation: 


Question 0.0.1. Find a differentiable function $y : \mathbb{R} \to \mathbb{R}$ which satisfies the
following:

(1) $y'(t) = \exp(t)$ for all $t \in \mathbb{R}$, and

(2) $y(0) = 10$.

ANSWER. From (1) we know that the function $y(t)$ must be of the form
$y(t) = \exp(t) + C$ for some fixed $C \in \mathbb{R}$. By (2) we know that
$y(0) = \exp(0) + C = 1 + C = 10$. Thus $C = 9$ and so $y(t) = \exp(t) + 9$.
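
As a quick sanity check on the answer above, the following sketch tests both conditions numerically. The helper `derivative` is a hypothetical central-difference approximation introduced only for illustration; it is not part of the course material.

```python
import math

def y(t):
    """Candidate solution from the answer above: y(t) = exp(t) + 9."""
    return math.exp(t) + 9.0

def derivative(f, t, h=1e-6):
    """Central-difference approximation of f'(t) (illustrative helper)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

# Condition (2): the prescribed value at t = 0.
print(y(0.0))  # 10.0

# Condition (1): y'(t) should agree with exp(t) at each sample point.
for t in (-2.0, 0.0, 1.0, 3.0):
    assert math.isclose(derivative(y, t), math.exp(t), rel_tol=1e-6)
```

A numerical check like this only samples finitely many values of $t$; the calculus argument above is what shows the conditions hold for all $t$.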


Question 0.0.1 illustrates a paradigm for differential equations in general. Namely,
we will often be given the following information:


(1) Information about an unknown function y’s derivative (or second deriva- 
tive, etc.), for instance, saying “y’(t) = exp(t)” 
(2) Information about specific function values of y (or y’, y”, etc.), for in- 
stance, saying “y(0) = 10”. 
The game will then be to use this information to determine the unknown
function y as specifically as we can. Before we go any further, we make the following 
declaration: 


You will not be able to solve most differential equations. 


This is by no means a commentary on anyone’s mathematical abilities; we simply
want to bring you up to speed with a cold hard fact of life: most differential equa- 
tions are impossible (for anyone) to solve exactly. However, we will study in detail 
many simple differential equations which we can solve exactly. Fortunately, the dif- 
ferential equations we will study also have many practical real-world applications. 


What about the non-solvable differential equations? Not all hope is lost in this case. 
Indeed, for practical real-world applications you generally only need a sufficiently 
accurate approximation of a solution. Luckily this is something that computers are 
very good at and this is a very active area of applied mathematics. We will not 
go down this rabbit-hole in this class, but it helps to be aware of this remedy so 
you are not too discouraged if and when you encounter an impossible differential 
equation. 
Algebraic equations


In this section we will review the state of affairs for one-variable algebraic equations. 
Recall that a one-variable algebraic equation is an equation of the form: 


p(X) = 0, 


where $p$ is a polynomial and $X$ is a variable. A solution to this equation is a
specific real number $x \in \mathbb{R}$ which has the property that $p(x) = 0$ (i.e., when we
plug the number $x$ into $p$, it evaluates to the number 0).


We also hope to make a general point in this section: that even for algebraic 
equations (i.e., a differential equation with no derivatives), things become very 
complicated and eventually impossible very quickly. 


Linear equations. A linear equation (in one variable) is an equation of the
form:
\[ a_1 X + a_0 = 0 \qquad (\text{where } a_1, a_0 \in \mathbb{R}) \]
If $a_1 \neq 0$, then this has exactly one solution, namely:
\[ x := -\frac{a_0}{a_1}. \]
If $a_1 = 0$, then this has either zero solutions (if $a_0 \neq 0$), or infinitely
many solutions (if $a_0 = 0$, then every $x \in \mathbb{R}$ is a solution). These
observations foreshadow various features of systems of linear equations in multiple
variables which we will study in Chapter 1.


Quadratic equations. A quadratic equation is an equation of the form: 
\[ a_2 X^2 + a_1 X + a_0 = 0 \qquad (\text{where } a_2, a_1, a_0 \in \mathbb{R}) \]

If $a_2 \neq 0$, then the quadratic formula yields solutions:

\[ x_1 := \frac{-a_1 + \sqrt{a_1^2 - 4a_2a_0}}{2a_2} \quad \text{and} \quad x_2 := \frac{-a_1 - \sqrt{a_1^2 - 4a_2a_0}}{2a_2}. \]

Recall that three things can happen depending on the sign of the discriminant
$a_1^2 - 4a_2a_0$:
(Case 1) If $a_1^2 - 4a_2a_0 > 0$, then $x_1 \neq x_2$ are two distinct real solutions.
(Case 2) If $a_1^2 - 4a_2a_0 = 0$, then $x_1 = x_2$ is a single real solution (of multiplicity
two).
(Case 3) If $a_1^2 - 4a_2a_0 < 0$, then $x_1 \neq x_2$ are two distinct solutions; however, they
will be complex solutions and not real solutions.
You are expected to be able to use the quadratic formula to solve quadratic equa- 
tions in this class. 
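
The three cases can be sketched in a few lines of Python. This is only an illustration under our own naming (`solve_quadratic` and `describe` are hypothetical helpers); it uses the standard-library `cmath` module so that Case 3 produces complex numbers instead of an error.

```python
import cmath

def solve_quadratic(a2, a1, a0):
    """Return (x1, x2) solving a2*X^2 + a1*X + a0 = 0, assuming a2 != 0."""
    if a2 == 0:
        raise ValueError("a2 must be nonzero for a quadratic equation")
    disc = a1 * a1 - 4 * a2 * a0
    root = cmath.sqrt(disc)  # complex square root, so disc < 0 is allowed
    return (-a1 + root) / (2 * a2), (-a1 - root) / (2 * a2)

def describe(a2, a1, a0):
    """Name the case determined by the sign of the discriminant."""
    disc = a1 * a1 - 4 * a2 * a0
    if disc > 0:
        return "two distinct real solutions"
    if disc == 0:
        return "one real solution of multiplicity two"
    return "two distinct complex (non-real) solutions"

for coeffs in [(1, -3, 2), (1, -2, 1), (1, 0, 1)]:
    print(coeffs, describe(*coeffs), solve_quadratic(*coeffs))
```

The three sample polynomials are $X^2 - 3X + 2$, $X^2 - 2X + 1$, and $X^2 + 1$, one for each case.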


Cubic equations. A cubic equation is an equation of the form: 
\[ a_3 X^3 + a_2 X^2 + a_1 X + a_0 = 0 \qquad (\text{where } a_3, a_2, a_1, a_0 \in \mathbb{R}) \]


You were probably never taught the formula for the cubic equation in school. This 
is for good reason: it’s complicated! You do not need it for this class either, but in 
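case you are curious, here it is: if $a_3 \neq 0$, then the three solutions are

\[ x_k = -\frac{1}{3a_3}\left(a_2 + \xi^k C + \frac{\Delta_0}{\xi^k C}\right), \quad \text{for } k = 0, 1, 2, \]
where
\[ \xi := \frac{-1 + \sqrt{-3}}{2}, \]
\[ \Delta_0 := a_2^2 - 3a_3a_1, \]
\[ \Delta_1 := 2a_2^3 - 9a_3a_2a_1 + 27a_3^2a_0, \]
\[ C := \sqrt[3]{\frac{\Delta_1 \pm \sqrt{\Delta_1^2 - 4\Delta_0^3}}{2}} \]

(choose either $+$ or $-$ provided $C \neq 0$).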
Here there can either be three, two, or one distinct solution, and the solutions can 
be either real or complex, much like the quadratic equation. 
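
For the curious, here is a hedged numerical sketch of the general cubic formula in its standard form: $x_k = -\frac{1}{3a_3}\left(a_2 + \xi^k C + \Delta_0/(\xi^k C)\right)$, with $\xi = (-1+\sqrt{-3})/2$, $\Delta_0 = a_2^2 - 3a_3a_1$, $\Delta_1 = 2a_2^3 - 9a_3a_2a_1 + 27a_3^2a_0$, and $C$ a cube root of $(\Delta_1 \pm \sqrt{\Delta_1^2 - 4\Delta_0^3})/2$. The function name `solve_cubic` and the way the sign is chosen are our own assumptions. Results are floating-point approximations.

```python
import cmath

def solve_cubic(a3, a2, a1, a0):
    """Approximate the three solutions of a3*X^3 + a2*X^2 + a1*X + a0 = 0."""
    if a3 == 0:
        raise ValueError("a3 must be nonzero for a cubic equation")
    xi = (-1 + cmath.sqrt(-3)) / 2            # primitive cube root of unity
    d0 = a2 ** 2 - 3 * a3 * a1
    d1 = 2 * a2 ** 3 - 9 * a3 * a2 * a1 + 27 * a3 ** 2 * a0
    s = cmath.sqrt(d1 ** 2 - 4 * d0 ** 3)
    c_cubed = (d1 + s) / 2                    # try the "+" choice first
    if c_cubed == 0:
        c_cubed = (d1 - s) / 2                # otherwise the "-" choice
    if c_cubed == 0:
        # Both choices vanish only when d0 = d1 = 0: a single triple root.
        return [complex(-a2 / (3 * a3))] * 3
    c = c_cubed ** (1 / 3)                    # principal complex cube root
    return [-(a2 + xi ** k * c + d0 / (xi ** k * c)) / (3 * a3)
            for k in range(3)]

# (X - 1)(X - 2)(X - 3) = X^3 - 6X^2 + 11X - 6
roots = solve_cubic(1, -6, 11, -6)
print(sorted(round(r.real, 9) for r in roots))
```

The example polynomial was chosen because its roots 1, 2, 3 are known in advance, which makes the output easy to compare by eye.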
Quartic equations. A quartic equation is an equation of the form: 
\[ a_4 X^4 + a_3 X^3 + a_2 X^2 + a_1 X + a_0 = 0 \qquad (\text{where } a_4, a_3, a_2, a_1, a_0 \in \mathbb{R}) \]


The general solution for the quartic equation is even more complicated than the 
equation for the cubic. You definitely do not need to know it, but in case you are 
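curious, here it is: if $a_4 \neq 0$, then the four solutions are:

\[ x_{1,2} = -\frac{a_3}{4a_4} - S \pm \frac{1}{2}\sqrt{-4S^2 - 2p + \frac{q}{S}}, \qquad x_{3,4} = -\frac{a_3}{4a_4} + S \pm \frac{1}{2}\sqrt{-4S^2 - 2p - \frac{q}{S}}, \]

where
\[ p := \frac{8a_4a_2 - 3a_3^2}{8a_4^2}, \qquad q := \frac{a_3^3 - 4a_4a_3a_2 + 8a_4^2a_1}{8a_4^3}, \]
\[ S := \frac{1}{2}\sqrt{-\frac{2}{3}p + \frac{1}{3a_4}\left(Q + \frac{\Delta_0}{Q}\right)}, \qquad Q := \sqrt[3]{\frac{\Delta_1 + \sqrt{\Delta_1^2 - 4\Delta_0^3}}{2}}, \]
\[ \Delta_0 := a_2^2 - 3a_3a_1 + 12a_4a_0, \]
\[ \Delta_1 := 2a_2^3 - 9a_3a_2a_1 + 27a_3^2a_0 + 27a_4a_1^2 - 72a_4a_2a_0 \]

(with special cases if $S = 0$ or $Q = 0$).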
Quintic (and higher degree) equations. A quintic equation is an equa- 
tion of the form: 
\[ a_5 X^5 + a_4 X^4 + a_3 X^3 + a_2 X^2 + a_1 X + a_0 = 0 \qquad (\text{where } a_5, a_4, a_3, a_2, a_1, a_0 \in \mathbb{R}) \]


You might be expecting an even longer and more complicated formula for the five 
solutions to a quintic equation, but actually it is known that this is impossible. In 
fact, there is a theorem which tells us that this is impossible: 
Theorem 0.0.2 (Galois). Suppose $n \geq 5$. Then there is no general formula using
radicals ($\sqrt{\phantom{x}}$, $\sqrt[3]{\phantom{x}}$, $\sqrt[4]{\phantom{x}}$, ...) which gives the solutions to

\[ a_n X^n + a_{n-1} X^{n-1} + \cdots + a_1 X + a_0 = 0 \]
in terms of the coefficients $a_n, \ldots, a_0$.


Of course, sometimes you will be able to solve for the solutions of a high-degree
polynomial equation (for instance, $x := 1$ is a solution to $X^{100} - 1 = 0$), but this
is usually because the polynomial is carefully chosen in order to admit solutions
you can find exactly. This is an exceptional case. In general, the only polynomial
equations for which you can expect a guaranteed solution are those of degree 1 (linear)
and degree 2 (quadratic). If we do encounter higher-degree polynomials in this class, they will
be chosen so that it is possible to find exact solutions. However, in general we will
stick to degree 2 or lower.


Conventions and notation 


In this class the natural numbers is the set N = {0,1,2,3,...} of nonnegative 
integers. In particular, we consider 0 to be a natural number. 


Unless stated otherwise, the following convention will be in force throughout the 
entire course: 


Global Convention 0.0.3. Throughout, m and n range over N = {0,1,2,...}. 


CHAPTER 1 


Linear algebra I 


Before commencing with differential equations, we begin with the first of three 
chapters on linear algebra. This might seem initially unrelated to differential equa- 
tions (like the one considered in Question 0.0.1) but we will soon find that linear 
algebra is intimately connected with many of the things we will do with differential 
equations and it is the best language to explain many different phenomena we will 
encounter. 


1.1. Systems of equations 


In this section we will give a crash course in the correct way to completely solve a 
system of equations (with any number of variables and any number of equations). 


Systems of equations. Here is an example of a system of equations: 
\[ \begin{aligned} 2X + Y &= 1 \\ X - Y &= 1 \end{aligned} \tag{1.1} \]

This is a system of equations with two variables ($X$ and $Y$) and two equations. A
solution to (1.1) is a pair $(x, y)$ of real numbers, such that when we plug in $x$ for $X$
and $y$ for $Y$, both equations are satisfied. We will recall how one solves (1.1) using
what we will call the naive method:


SOLUTION TO (1.1). First we will multiply the second equation by 2 so that the
coefficients on “X” are the same:
\[ \begin{aligned} 2X + Y &= 1 \\ 2X - 2Y &= 2 \end{aligned} \tag{1.2} \]

Next we will subtract the first equation from the second equation to eliminate the
second “X”:

\[ \begin{aligned} 2X + Y &= 1 \\ -3Y &= 1 \end{aligned} \tag{1.3} \]
Now we see that $y := -1/3$ is the only value for $Y$ which works. Plugging this into
the top equation yields:

\[ 2X - 1/3 = 1 \quad \text{and thus} \quad X = 2/3. \]
Thus $x := 2/3$ is the only value for $X$ that works. We conclude that $(x, y) =
(2/3, -1/3)$ is the only solution to (1.1).


We call this the naive method because it relies on observations and ad hoc com- 
putations. We include it here mainly to jog your memory of how you might have 
previously learned to solve systems of equations. However, this method quickly 
becomes burdensome when you consider more variables and more equations. In the
rest of this section, we will introduce the correct method you should use to solve
these systems. At this point we make the following declaration:


You should never again use the naive method 


to solve a system of equations. 


Instead you should commit to learning and using the method introduced below. 
Before we proceed, we will make a few more definitions: 


Definition 1.1.1. A system of equations (with m equations and n variables)
is a system

\[ \begin{aligned} a_{11}X_1 + a_{12}X_2 + \cdots + a_{1n}X_n &= b_1 \\ a_{21}X_1 + a_{22}X_2 + \cdots + a_{2n}X_n &= b_2 \\ &\;\;\vdots \\ a_{m1}X_1 + a_{m2}X_2 + \cdots + a_{mn}X_n &= b_m \end{aligned} \tag{1.4} \]
where $b_i, a_{ij} \in \mathbb{R}$ for every $i = 1, \ldots, m$ and $j = 1, \ldots, n$. A solution to the
system (1.4) is an n-tuple $(x_1, x_2, \ldots, x_n)$ of real numbers such that when you plug
$x_j$ in for $X_j$ (for each $j = 1, \ldots, n$), each equation is true.


Example 1.1.2. The following system has 3 equations and 4 variables: 
\[ \begin{aligned} X_1 + 2X_2 - 3X_3 + X_4 &= 6 \\ 2X_1 + X_2 - 2X_3 - X_4 &= 4 \\ 6X_2 + 4X_3 - X_4 &= 4 \end{aligned} \]


and it is easy to check that (1/3,4/3, —1,0) is a solution (although there are other 
solutions as well). 
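
Checking a proposed solution is purely mechanical: plug it in and compare both sides of every equation. The sketch below does this for Example 1.1.2 with exact rational arithmetic; the helper name `is_solution` is our own.

```python
from fractions import Fraction as F

def is_solution(system, candidate):
    """system: list of (coefficients, right-hand side) pairs.
    Returns True when the candidate tuple satisfies every equation."""
    return all(
        sum(a * x for a, x in zip(coeffs, candidate)) == rhs
        for coeffs, rhs in system
    )

# Example 1.1.2, with a coefficient 0 for the variable missing from row 3.
system = [
    ([1, 2, -3, 1], 6),
    ([2, 1, -2, -1], 4),
    ([0, 6, 4, -1], 4),
]

print(is_solution(system, (F(1, 3), F(4, 3), F(-1), F(0))))  # True
print(is_solution(system, (0, 0, 0, 0)))                     # False
```

Using `Fraction` instead of floating-point numbers avoids rounding issues with values such as 1/3.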


In general the goal will be to find all solutions to a system of equations, not just 
one single solution. 


Augmented matrices. Recall that in our solution to the system (1.1) above 
we first had the system 


\[ \begin{aligned} 2X + 1Y &= 1 \\ 2X - 2Y &= 2 \end{aligned} \]
which we then transformed into the system

\[ \begin{aligned} 2X + 1Y &= 1 \\ 0X - 3Y &= 1. \end{aligned} \tag{1.5} \]


Note also that every symbol colored in red has nothing to do with the specific 
numbers; the presence and locations of “X”, “Y” and “=” is always guaranteed 
to be exactly the same each time we transform the system. The only thing that 
matters for each system is what coefficients are in which spot. 


This brings us to the first major innovation linear algebra has to offer us for systems 
of equations: augmented matrices. An augmented matrix for a system of m 
equations in n variables (such as (1.4) above) is a rectangular array with m rows
and n + 1 columns which stores all the coefficients of the system:

\[ \left[\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{array}\right] \]


Example 1.1.3. For example, the system 
\[ \begin{aligned} 3a + 4b + c &= 2 \\ a - 5c &= 3 \end{aligned} \]
has corresponding augmented matrix
\[ \left[\begin{array}{ccc|c} 3 & 4 & 1 & 2 \\ 1 & 0 & -5 & 3 \end{array}\right] \]
In other words an augmented matrix is nothing more than a compact storage device
for an entire system of equations. Whenever you see a system of equations, you
should also picture its augmented matrix, and vice versa.


Henceforth, we will primarily use augmented matrices 


for writing systems of equations. 


Now we return to the main order of business which is to efficiently solve systems 
of equations (i.e., determine all solutions). Basically, we will learn how to play 
a game. The game is called Gaussian Elimination. The rules of the game are 
roughly as follows: 
(I) There are three legal moves (so-called elementary row operations) which we 
can use to transform one augmented matrix into the next augmented matrix. 
(II) When starting out, the first¹ goal is to transform your matrix into Row
Echelon Form.
(III) After getting to Row Echelon Form, the next goal is to continue to transform 
your matrix into Reduced Row Echelon Form. 
(IV) Once the matrix is in Reduced Row Echelon Form, it is very easy to read off 
all solutions to the original system. 


We will study these four things separately in the remainder of this section. 


Row operations. Suppose we have an augmented matrix 


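\[ \left[\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{array}\right] \]

The following elementary row operations are the only ways we are allowed to
transform this augmented matrix:
(1) (Row switching) A row in the matrix can be switched with another row
in the matrix. Notation: $R_i \leftrightarrow R_j$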


(2) (Row multiplication) A row can be multiplied by a non-zero constant.
Notation: $aR_i \to R_i$

(3) (Row addition) A row can be replaced with the sum of that row and a
multiple of another row. Notation: $R_i + aR_j \to R_i$.

¹ In some linear algebra books and classes, this step is skipped and the goal is to go directly
to reduced row echelon form in (III). It’s fine if you do it that way, although in general it will take
the same amount of work and effort.


Here is an example of a sequence of three applications of elementary row operations
to an augmented matrix, performed in this order:

    $R_1 \leftrightarrow R_2$   (row switch row 1 and row 2)
    $\frac{1}{2}R_1 \to R_1$   (multiply row 1 by 1/2)
    $R_1 - 2R_2 \to R_1$   (add −2 times row 2 to row 1)
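
The three elementary row operations are easy to express in code. The sketch below is an illustration only: the function names are ours, rows are 0-indexed (so row 1 of the notes is index 0), and the starting matrix is a hypothetical one chosen for this example rather than taken from the notes.

```python
from fractions import Fraction

def switch_rows(matrix, i, j):
    """Row switching: R_i <-> R_j."""
    result = [row[:] for row in matrix]
    result[i], result[j] = result[j], result[i]
    return result

def scale_row(matrix, i, a):
    """Row multiplication: a*R_i -> R_i, where a must be nonzero."""
    if a == 0:
        raise ValueError("multiplying a row by 0 is not an elementary row operation")
    result = [row[:] for row in matrix]
    result[i] = [a * entry for entry in result[i]]
    return result

def add_multiple(matrix, i, j, a):
    """Row addition: R_i + a*R_j -> R_i, with i != j."""
    if i == j:
        raise ValueError("row addition needs two different rows")
    result = [row[:] for row in matrix]
    result[i] = [x + a * y for x, y in zip(result[i], result[j])]
    return result

# Hypothetical augmented matrix [0 1 | 1; 2 4 | 6] (an assumption for this demo).
start = [[Fraction(0), Fraction(1), Fraction(1)],
         [Fraction(2), Fraction(4), Fraction(6)]]
step1 = switch_rows(start, 0, 1)              # R_1 <-> R_2
step2 = scale_row(step1, 0, Fraction(1, 2))   # (1/2) R_1 -> R_1
step3 = add_multiple(step2, 0, 1, -2)         # R_1 - 2 R_2 -> R_1
print(step3 == [[1, 0, 1], [0, 1, 1]])        # True
```

Each function returns a new matrix instead of modifying its input, which mirrors the habit of recopying the whole augmented matrix at every step.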


Question 1.1.4. Why are these the only operations allowed? 


PROOF. These row operations have the property that they are reversible. This 
means that the set of solutions remains the same in each augmented matrix. Note 
that if we allowed “multiplication by 0” to be a row operation, then this would 
have the effect of deleting information in the system and it might introduce addi- 
tional solutions which are not solutions of the original system (which would be very 
undesirable). 


Below we will explain how to use these row operations to achieve our objective of 
solving the original system of equations. 


Row echelon form (REF). We will illustrate the entire process with the 
following example which we will occasionally check back in with: 
Example 1.1.5. Find all solutions to the system 
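\[ \begin{aligned} 3X_1 + 6X_2 + 6X_3 &= 24 \\ -6X_1 - 12X_2 - 12X_3 &= -48 \\ 6X_1 + 12X_2 + 10X_3 &= 42 \end{aligned} \tag{1.6} \]

SOLUTION TO EXAMPLE 1.1.5, PART I. The first step is to rewrite the system (1.6)
as an augmented matrix:

\[ \left[\begin{array}{ccc|c} 3 & 6 & 6 & 24 \\ -6 & -12 & -12 & -48 \\ 6 & 12 & 10 & 42 \end{array}\right] \]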

Now we need to know how we are supposed to transform our augmented matrix
using the three elementary row operations. The first objective is to transform our
augmented matrix into row echelon form:


Definition 1.1.6. An augmented matrix is in row echelon form (REF) if 


(1) every row with nonzero entries is above every row with all zeroes (if there 
are any), and 

(2) the leading coefficient of a nonzero row (i.e., the leftmost nonzero entry 
of that row) is to the right of the leading coefficient of the row above it. 
Example 1.1.7. The following augmented matrices are in REF (with the leading
coefficients underlined): 


1 0 0} 1 
4 3] 1 0 1 0/2 2 3 0/0 0 0)2 
E ial lo 3 1/8] 0 0 1/3 ei E a 
0 0 0/0 
The following augmented matrices are not in REF: 
1 0 Oj}1 0 1/40 
k : | 0 0 1)2 0 0] 0 
= 0 1 0/3 1 0] 0 
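
Definition 1.1.6 translates directly into a short check. The following sketch uses our own helper names (`leading_index`, `is_ref`) and treats an augmented matrix simply as a list of rows, ignoring the vertical bar.

```python
def leading_index(row):
    """Column index of the leftmost nonzero entry, or None for a zero row."""
    for j, entry in enumerate(row):
        if entry != 0:
            return j
    return None

def is_ref(matrix):
    """Check the two conditions of row echelon form."""
    previous_lead = -1
    seen_zero_row = False
    for row in matrix:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:
            return False  # condition (1): a nonzero row lies below a zero row
        if lead <= previous_lead:
            return False  # condition (2): leading entry not strictly to the right
        previous_lead = lead
    return True

print(is_ref([[3, 6, 6, 24], [0, 0, -2, -6], [0, 0, 0, 0]]))  # True
print(is_ref([[3, 6, 6, 24], [0, 0, 0, 0], [0, 0, -2, -6]]))  # False
print(is_ref([[1, 0, 0, 1], [0, 0, 1, 2], [0, 1, 0, 3]]))     # False
```

The second matrix fails condition (1) and the third fails condition (2).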


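SOLUTION TO EXAMPLE 1.1.5, PART II. Our augmented matrix is not in row echelon
form. In particular, the leading coefficients of the second and third row are
directly below the leading coefficient of the first row, which is not allowed:

\[ \left[\begin{array}{ccc|c} 3 & 6 & 6 & 24 \\ -6 & -12 & -12 & -48 \\ 6 & 12 & 10 & 42 \end{array}\right] \]
To fix this, we need to use row addition with the first row to turn the leading $-6$
and $6$ of the second and third row into a zero:

\[ \left[\begin{array}{ccc|c} 3 & 6 & 6 & 24 \\ -6 & -12 & -12 & -48 \\ 6 & 12 & 10 & 42 \end{array}\right] \xrightarrow{R_2 + 2R_1 \to R_2} \left[\begin{array}{ccc|c} 3 & 6 & 6 & 24 \\ 0 & 0 & 0 & 0 \\ 6 & 12 & 10 & 42 \end{array}\right] \xrightarrow{R_3 - 2R_1 \to R_3} \left[\begin{array}{ccc|c} 3 & 6 & 6 & 24 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -2 & -6 \end{array}\right] \]

We are still not in row echelon form since we have a row of all zeros above a row
with nonzero entries:

\[ \left[\begin{array}{ccc|c} 3 & 6 & 6 & 24 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -2 & -6 \end{array}\right] \]
To remedy this, we will switch rows 2 and 3:
\[ \xrightarrow{R_2 \leftrightarrow R_3} \left[\begin{array}{ccc|c} 3 & 6 & 6 & 24 \\ 0 & 0 & -2 & -6 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

We are now in row echelon form and we are done with this step.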


Once our augmented matrix is in row echelon form, we can make the following 
definition: 


Definition 1.1.8. Given an augmented matrix in REF, a pivot is a leading coefficient in
a nonzero row.


For instance, the augmented matrix we arrived at in Example 1.1.5 has two pivots,
which we indicate in boxes:

\[ \left[\begin{array}{ccc|c} \boxed{3} & 6 & 6 & 24 \\ 0 & 0 & \boxed{-2} & -6 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

Pivots play an important role in Gaussian Elimination. The next step is to take
our augmented matrix a little bit further to reduced row echelon form.


Reduced row echelon form (RREF). The ultimate goal is to get our aug- 
mented matrix into reduced row echelon form: 


Definition 1.1.9. An augmented matrix is in reduced row echelon form (RREF) 
if 

(1) it is in row echelon form (REF), 

(2) every pivot is 1, and 

(3) every entry above a pivot is 0. 


Example 1.1.10. The following augmented matrices are in RREF: 


1} 2 0 
lo Glo] Jo o Gj) o F : 
0 0 O ffl 
The following matrices are in REF but not RREF: 
4) 3 ] 1 2 0 
E ae lo [3] 118 E i 


We now continue on with our main example: 


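SOLUTION TO EXAMPLE 1.1.5, PART III. We see that the augmented matrix we
left off with is not in RREF, only REF. This is because the pivots are $3$ and $-2$,
not $1$ and $1$, and also the underlined $6$ should be a $0$:

\[ \left[\begin{array}{ccc|c} \boxed{3} & 6 & \underline{6} & 24 \\ 0 & 0 & \boxed{-2} & -6 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

To remedy this, we use row multiplication to fix the pivot values, and then row
addition to get rid of the 6:

\[ \xrightarrow{\frac{1}{3}R_1 \to R_1} \left[\begin{array}{ccc|c} 1 & 2 & 2 & 8 \\ 0 & 0 & -2 & -6 \\ 0 & 0 & 0 & 0 \end{array}\right] \xrightarrow{-\frac{1}{2}R_2 \to R_2} \left[\begin{array}{ccc|c} 1 & 2 & 2 & 8 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 \end{array}\right] \xrightarrow{R_1 - 2R_2 \to R_1} \left[\begin{array}{ccc|c} 1 & 2 & 0 & 2 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 \end{array}\right] \]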


Finally we arrive at RREF. 
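
The whole reduction can be automated. The following is a minimal sketch, not the procedure exactly as laid out above: it goes straight to RREF column by column (the shortcut mentioned in the footnote) instead of stopping at REF first, and it uses exact `Fraction` arithmetic. The function name `rref` is our own.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row echelon form of a matrix (list of rows)."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(cols):
        if pivot_row == rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        source = next((r for r in range(pivot_row, rows) if m[r][col] != 0), None)
        if source is None:
            continue  # no pivot in this column
        m[pivot_row], m[source] = m[source], m[pivot_row]     # row switching
        pivot = m[pivot_row][col]
        m[pivot_row] = [x / pivot for x in m[pivot_row]]      # row multiplication
        for r in range(rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y                        # row addition
                        for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

example = [[3, 6, 6, 24], [-6, -12, -12, -48], [6, 12, 10, 42]]
print(rref(example) == [[1, 2, 0, 2], [0, 0, 1, 3], [0, 0, 0, 0]])  # True
```

The intermediate matrices differ from the hand computation, but the final RREF agrees with the one obtained above.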


Once our augmented matrix is in RREF, it is easy to read off all solutions of the 
original system. 
Getting the final answer from RREF. We will describe how to get the
final answer from RREF first in terms of our main example: 


SOLUTION TO EXAMPLE 1.1.5, PART IV. First recall that the first three columns
correspond to the three variables $X_1$, $X_2$, and $X_3$:

\[ \begin{array}{c} \begin{array}{cccc} X_1 & X_2 & X_3 & \phantom{0} \end{array} \\ \left[\begin{array}{ccc|c} \boxed{1} & 2 & 0 & 2 \\ 0 & 0 & \boxed{1} & 3 \\ 0 & 0 & 0 & 0 \end{array}\right] \end{array} \]

Since $X_1$ and $X_3$ have pivots in their columns, $X_1$ and $X_3$ are called pivot
variables and the first and third columns are called pivot columns. Since $X_2$ does
not have a pivot, it is called a free variable and the second column is called a free
column. Now we read off the solutions using the following steps:


(1) Each free variable can take any arbitrary value. In this case, we will say
that $X_2 = s$, where $s \in \mathbb{R}$ is any number we like.
(2) Next we rewrite the augmented matrix as a system and solve for the pivot
variables:
\[ \begin{aligned} X_1 + 2X_2 &= 2 \\ X_3 &= 3 \\ 0 &= 0 \end{aligned} \]
which simplifies to:
\[ \begin{aligned} X_1 &= 2 - 2s \\ X_3 &= 3 \end{aligned} \]
We now have our final answer: every solution is of the form:
\[ \begin{aligned} X_1 &= 2 - 2s \\ X_2 &= s \\ X_3 &= 3, \end{aligned} \]

where $s \in \mathbb{R}$ can be any number. We write the set of all solutions as
follows:
\[ \{(2 - 2s, s, 3) : s \in \mathbb{R}\} \]

This way of describing the set of solutions is often called parametric 
form because it describes the solutions in terms of the free parameter s. 
Notice that there are infinitely many solutions, since there are infinitely 
many values of s. To get specific solutions, you can just choose values of 
s. For instance, s := 0 yields the solution (2,0,3), whereas s := 10 yields 
the solution (—18, 10,3). 
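
It is worth confirming that the parametric form really solves the original system (1.6), not just its RREF. The sketch below plugs a few values of $s$ into all three original equations; the names `solution` and `satisfies` are ours.

```python
from fractions import Fraction

# Coefficients and right-hand sides of system (1.6).
A = [[3, 6, 6], [-6, -12, -12], [6, 12, 10]]
b = [24, -48, 42]

def solution(s):
    """The parametric solution (2 - 2s, s, 3) for a chosen value of s."""
    return (2 - 2 * s, s, 3)

def satisfies(x):
    """True when x satisfies every equation of the system."""
    return all(sum(a * xi for a, xi in zip(row, x)) == rhs
               for row, rhs in zip(A, b))

for s in (0, 10, -1, Fraction(7, 3)):
    assert satisfies(solution(s))

print(solution(0), solution(10))  # (2, 0, 3) (-18, 10, 3)
```

A triple that does not fit the parametric form, such as (0, 0, 0), fails the check.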


Example 1.1.11. In this example we will see what to do with 2 free variables. 
Suppose we are given some system which has the following RREF: 

\[ \begin{array}{c} \begin{array}{ccccc} X_1 & X_2 & X_3 & X_4 & \phantom{0} \end{array} \\ \left[\begin{array}{cccc|c} 0 & \boxed{1} & 2 & 0 & 7 \\ 0 & 0 & 0 & \boxed{1} & 8 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right] \end{array} \]

Then we have two free variables $X_1$ and $X_3$, so we need to introduce two parameters
$s, t \in \mathbb{R}$ and set $X_1 = s$ and $X_3 = t$. Then the system becomes:

\[ \begin{aligned} X_2 + 2X_3 &= 7 \\ X_4 &= 8 \end{aligned} \]
and so the general solution is:
\[ \begin{aligned} X_1 &= s \\ X_2 &= -2t + 7 \\ X_3 &= t \\ X_4 &= 8 \end{aligned} \]

where $s, t \in \mathbb{R}$ are arbitrary. We can write the set of solutions in parametric form
as follows:

\[ \{(s, -2t + 7, t, 8) : s, t \in \mathbb{R}\} \]
Note that to get a specific solution, we are free to choose any $s$ and any $t$ we like.
For instance, $s = 1, t = 0$ gives the solution $(1, 7, 0, 8)$ whereas $s = 0, t = 1$ gives
the solution $(0, 5, 1, 8)$.
Example 1.1.12. We will give an example of a system with no solutions. Suppose 
we are given a system with the following RREF: 


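\[ \begin{array}{c} \begin{array}{ccc} X_1 & X_2 & \phantom{0} \end{array} \\ \left[\begin{array}{cc|c} \boxed{1} & 2 & 0 \\ 0 & 0 & 1 \end{array}\right] \end{array} \]
Converting this augmented matrix back to a system of equations yields:

\[ \begin{aligned} 1X_1 + 2X_2 &= 0 \\ 0X_1 + 0X_2 &= 1 \end{aligned} \]

We claim there cannot be any solutions. Indeed, if say $(x_1, x_2)$ is a solution, then
this would mean it satisfies both equations, in particular, the bottom equation.
Then $0x_1 + 0x_2 = 1$, i.e., $0 = 1$. However this is always false.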


We conclude this section with some more terminology and some general facts: 


Definition 1.1.13. We say that a system of equations is consistent if it has at 
least one solution, and we say a system of equations is inconsistent if it does not 
have any solutions. 


Fact 1.1.14. Given a system of equations, exactly one of the following three things 
will happen: 
(1) The system has zero solutions (i.e., it is inconsistent). This happens when 
the RREF contains a row of the form 
\[ \left[\begin{array}{ccc|c} 0 & \cdots & 0 & 1 \end{array}\right] \]
because this corresponds to the equation 0 = 1 which can never be true. 
(2) The system has exactly one solution. This happens when the system is 
consistent and there are no free variables in the RREF. 


(3) The system has infinitely many solutions. This happens when the system 
is consistent and there is at least one free variable in the RREF. 
In fact, all 3 of the above cases can be determined once you’re in REF. If you only
care about how many solutions there are (and not what exactly they are), then you 
can just stop once you get to REF. This is one of the benefits of going through the 
REF on your way to RREF. 


Here are some cardinal rules to always follow: 


(1) Always recopy the entire augmented matrix in each step, even if you are 
copying a row of zeros. It is important that the size of the augmented 
matrix (3 x 4 in our example) does not change. 

(2) Always denote which row operation you are performing in each step. 

(3) Always do one row operation at a time, at least when you are starting 
out. If you attempt to do multiple row operations in one step then this 
can lead to errors. 


Remark 1.1.15. Given a system of equations, we take it to RREF and obtain the 
set of solutions for the original system we started out with. However, this is actually 
the set of solutions for every system we encountered along the way. This is because 
the RREF of the original system also works as the RREF for every intermediate 
system. 


Geometric interpretation. When you are solving systems of equations, it is 
good to keep in mind the underlying geometric interpretation. Recall that a linear 
equation in two variables: 


\[ 2x + 3y = 1 \]
can also be viewed as an equation for a line in the plane ($y = -\tfrac{2}{3}x + \tfrac{1}{3}$). Thus, a
system of linear equations:
\[ \begin{aligned} 2x + 3y &= 1 \\ 5x + 7y &= 2 \\ x + y &= 3 \end{aligned} \]

is really asking us to find all points $(x, y)$ in the plane which lie on all three lines,
i.e., we want to know where these three lines intersect, if at all. If we consider
three variables, then we are asking where multiple planes simultaneously intersect,
if at all. For more than 3 variables, we are asking where higher-dimensional
hyperplanes intersect in higher-dimensional Euclidean space (something difficult to
visualize).


In Figure 1.1 we consider five systems of equations, where each one has two variables
and three equations. You can see that there are different ways that the cases
no solutions, exactly one solution, and infinitely many solutions can arise. 


[Figure 1.1, five panels: (A) Unique solution; (B) Infinitely many solutions; (C) No solutions; (D) No solutions; (E) No solutions]

FIGURE 1.1. Possible intersections of three lines in a plane


Some specifics about terminology. In this section, we have only been work- 
ing with augmented matrices, for instance 


\[ \left[\begin{array}{cc|c} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array}\right] \tag{1.7} \]


An augmented matrix is just a special example of a matrix with a vertical bar 
which superficially separates the columns. A matrix (with m rows and n columns) 
is a rectangular array of numbers: 


\[ \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]
For instance, the augmented matrix (1.7) is considered a $2 \times 3$ matrix. When
discussing an augmented matrix, we will always consider every column as part of
the augmented matrix. If we want to refer only to the entries to the left of the
vertical bar:
\[ \begin{bmatrix} 1 & 2 \\ 4 & 5 \end{bmatrix} \]


this will be referred to as the coefficient matrix (of the linear system). 


Definition 1.1.16. Here are some precise definitions summarized: 
(1) Given a matrix, a leading entry of a row is the leftmost nonzero entry (if 
there is one). In the following matrices, we underline the leading entries: 
i) ery 
0 2 0 O/;1 


ee) 


(2) If a matrix is in REF, then the leading entries are also called pivots. The 
following matrices are in REF and the pivots are in boxes: 


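\[ \begin{bmatrix} \boxed{2} & 3 & 5 & 4 \\ 0 & 0 & \boxed{7} & 1 \\ 0 & 0 & 0 & \boxed{2} \\ 0 & 0 & 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} 0 & \boxed{1} & 0 & 0 \\ 0 & 0 & \boxed{1} & 0 \\ 0 & 0 & 0 & \boxed{1} \\ 0 & 0 & 0 & 0 \end{bmatrix} \]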


(3) If a matrix is not in REF, then we choose not to define what a pivot 
is. In this class we will only discuss “pivots” in the context of Gaussian 
Elimination and only allow ourselves to refer to “the pivots of a matrix” 
if we know the matrix is already in REF. For all matrices, the expression 
“leading entry” will always make sense, regardless of whether the matrix 
is in REF or not. 

(4) We define the rank of a matrix to be the number of pivots any REF of 
that matrix has (it will be the same number even though there could be 
many different REFs). 


Question 1.1.17. Why are we reluctant to call leading coefficients in a non-REF 
matrix “pivots”? 


Answer 1.1.18. In general, a pivot (noun) is something that you pivot (verb) 
around. Given a nonzero entry of a matrix, to pivot around that entry means 
to use elementary row operations to turn that entry into a 1 and then use it to 
turn the other entries in that column into 0. In the following example, we pivot 
around the boxed entry (for no particular reason other than to show an example of 
“pivoting” ): 


I 1 1 1R.> Re 1! . R,—R2->R 
, (ool 2 Se 4 (|) 
5 3 2 zs 9 3 
0 0 0 
1 fi a he al i 
3 Ss 0 0 0 
Since this is what “pivoting” means, we define pivots so that in Gaussian Elimination
we are essentially pivoting around the pivots. We do not pivot around the leading
entries which are not pivots. Furthermore, there are other algorithms in linear
algebra besides Gaussian Elimination (for instance, the Simplex Algorithm) where
you pivot around entries which are not leading coefficients. Thus, you shouldn’t
get too attached to the idea “pivot means leading entry”.


Given the above discussion, we can now recast some of the above facts in more 
detail: 


Fact 1.1.19. Suppose we are considering a system of equations which has aug- 
mented matrix: 


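\[ \left[\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{array}\right] \]
and coefficient matrix:
\[ \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]

(1) The following are equivalent:
(a) the system has no solutions,
(b) the system is inconsistent,
(c) an REF of the augmented matrix has a row of the form
\[ \left[\begin{array}{ccc|c} 0 & \cdots & 0 & a \end{array}\right] \quad \text{with } a \neq 0, \]
(d) the RREF of the augmented matrix has a row of the form
\[ \left[\begin{array}{ccc|c} 0 & \cdots & 0 & 1 \end{array}\right], \]
(e) an REF of the augmented matrix has a pivot in the last column,
(f) the RREF of the augmented matrix has a pivot in the last column,
(g) the rank of the coefficient matrix is not equal to the rank of the entire
augmented matrix.
(2) Suppose the system is consistent. Then the following are equivalent:
(a) the system has exactly one solution,
(b) every variable is a pivot variable,
(c) there are no free variables,
(d) the rank of the augmented matrix is equal to the number of columns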
in the coefficient matrix (= number of variables). 
(3) Suppose the system is consistent. Then the following are equivalent: 
(a) the system has infinitely many solutions, 
(b) at least one variable is a free variable, 
(c) the rank of the augmented matrix is less than the number of columns 
in the coefficient matrix (i.e., less than the number of variables). 
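
The rank conditions in Fact 1.1.19 give a compact way to count solutions without reading them off. The sketch below is one possible implementation under our own naming: `rank` counts pivots by forward elimination to an REF, and `classify` compares the rank of the coefficient matrix, the rank of the augmented matrix, and the number of variables.

```python
from fractions import Fraction

def rank(matrix):
    """Number of pivots in an REF of the matrix (forward elimination only)."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rows = len(m)
    cols = len(m[0]) if m else 0
    r = 0
    for col in range(cols):
        if r == rows:
            break
        source = next((i for i in range(r, rows) if m[i][col] != 0), None)
        if source is None:
            continue
        m[r], m[source] = m[source], m[r]
        for i in range(r + 1, rows):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def classify(augmented):
    """Apply Fact 1.1.19 to an augmented matrix (last column = right-hand sides)."""
    coefficient = [row[:-1] for row in augmented]
    variables = len(coefficient[0])
    if rank(coefficient) != rank(augmented):
        return "no solutions"
    if rank(augmented) == variables:
        return "exactly one solution"
    return "infinitely many solutions"

print(classify([[2, 1, 1], [1, -1, 1]]))                                  # system (1.1)
print(classify([[3, 6, 6, 24], [-6, -12, -12, -48], [6, 12, 10, 42]]))    # Example 1.1.5
print(classify([[1, 2, 0], [0, 0, 1]]))                                   # Example 1.1.12
```

The three calls cover the three cases: a unique solution, infinitely many solutions, and an inconsistent system.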
1.2. Application: partial fractions


In this section, we revisit the powerful method of partial fractions, viewed as an 
application of linear systems. 


Case I: distinct linear factors. Suppose we want to integrate the rational 
function: 
\[ \frac{3x + 4}{x^3 - 3x^2 + 2x} \]

To do this, we must first factor the denominator polynomial: $x^3 - 3x^2 + 2x =
(x - 0)(x - 1)(x - 2)$. Since there are no (strictly) complex roots, this polynomial
factors into linear factors (with real roots). Also, for this polynomial, every linear
factor is distinct (occurs with multiplicity one). Thus, the general form of the
partial fraction decomposition is:

\[ \frac{3x + 4}{x(x - 1)(x - 2)} = \frac{A}{x} + \frac{B}{x - 1} + \frac{C}{x - 2} \]

where $A, B, C \in \mathbb{R}$ are three unknown real numbers we need to solve for. Clearing
denominators yields:

\[ 3x + 4 = A(x - 1)(x - 2) + Bx(x - 2) + Cx(x - 1) \]


This equality is to be interpreted as: for every possible real number x € R, when 
you plug x into both the lefthand side and the righthand side, you should get a true 
equality of two numbers. We will use this observation and plug in three carefully 
chosen numbers to see what they give us: 


• ($x = 0$) In this case, the equation becomes $4 = 2A$
• ($x = 1$) In this case, the equation becomes $7 = -B$
• ($x = 2$) In this case, the equation becomes $10 = 2C$


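Thus, we have arrived at an (easy) system of equations:

\[ \begin{aligned} 2A &= 4 \\ -B &= 7 \\ 2C &= 10. \end{aligned} \]

We can solve this system using Gaussian Elimination:

\[ \left[\begin{array}{ccc|c} 2 & 0 & 0 & 4 \\ 0 & -1 & 0 & 7 \\ 0 & 0 & 2 & 10 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{ccc|c} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -7 \\ 0 & 0 & 1 & 5 \end{array}\right] \]

This gives us the unique solution $(A, B, C) = (2, -7, 5)$. We conclude that

\[ \frac{3x + 4}{x^3 - 3x^2 + 2x} = \frac{2}{x} - \frac{7}{x - 1} + \frac{5}{x - 2} \]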


is our desired partial fraction decomposition. The rational function can now be 
integrated using the logarithm. 
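
A decomposition like this is easy to spot-check: both sides must agree at every $x$ other than the roots 0, 1, 2 of the denominator. The following sketch compares the two sides exactly at a handful of integer sample points; the function names are ours.

```python
from fractions import Fraction

def original(x):
    """(3x + 4) / (x^3 - 3x^2 + 2x), evaluated exactly at an integer x."""
    return Fraction(3 * x + 4, x ** 3 - 3 * x ** 2 + 2 * x)

def decomposed(x):
    """2/x - 7/(x - 1) + 5/(x - 2), evaluated exactly at an integer x."""
    return Fraction(2, x) - Fraction(7, x - 1) + Fraction(5, x - 2)

# Sample points that avoid the poles x = 0, 1, 2.
for x in (-3, -1, 3, 5, 10):
    assert original(x) == decomposed(x)

print(original(3), decomposed(3))  # 13/6 13/6
```

Agreement at finitely many points is only a sanity check, but since both sides are rational functions of low degree it catches sign and arithmetic slips reliably.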
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Case ITI: repeated linear factors. Suppose now we wish to decompose 


5a? + 6x? +72 +8 
wt — Qe3 + a? 


We are able to factor the denominator as x+ — 2x° + 2? = x?(x — 1)?. We see that 
there are two linear factors, each one with multiplicity two. Thus the general form 
of the partial fraction decomposition is 
bee +607 +7e+8 _ Be C “b D 
x? (a — 1)? — ae g%  g-1 (x—1)? 

where $A, B, C, D \in \mathbb{R}$ are four unknown real numbers we need to solve for (the rule
is, for each multiplicity of a linear factor, you get another term in the expansion
and another variable). First we cross-multiply so that we have an equality of
polynomials, then we rewrite the right-hand side as a single polynomial:

\[ \begin{aligned}
5x^3 + 6x^2 + 7x + 8 &= Ax(x-1)^2 + B(x-1)^2 + Cx^2(x-1) + Dx^2 \\
&= A(x^3 - 2x^2 + x) + B(x^2 - 2x + 1) + C(x^3 - x^2) + Dx^2 \\
&= (A+C)x^3 + (-2A+B-C+D)x^2 + (A-2B)x + B.
\end{aligned} \]


Next, we use the important observation that two polynomials are the same if and 
only if they have the same degree and the corresponding coefficients are the same. 
Thus the above equality of polynomials yields the system: 


\[ \begin{aligned}
A + C &= 5 \\
-2A + B - C + D &= 6 \\
A - 2B &= 7 \\
B &= 8.
\end{aligned} \]


We can now solve the system using Gaussian Elimination: 


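\[ \left[\begin{array}{cccc|c} 1 & 0 & 1 & 0 & 5 \\ -2 & 1 & -1 & 1 & 6 \\ 1 & -2 & 0 & 0 & 7 \\ 0 & 1 & 0 & 0 & 8 \end{array}\right]
\xrightarrow{\text{to RREF (steps omitted)}}
\left[\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 23 \\ 0 & 1 & 0 & 0 & 8 \\ 0 & 0 & 1 & 0 & -18 \\ 0 & 0 & 0 & 1 & 26 \end{array}\right] \]

We find that the unique solution is $(A, B, C, D) = (23, 8, -18, 26)$. Thus the desired
partial fraction decomposition is

\[ \frac{5x^3 + 6x^2 + 7x + 8}{x^4 - 2x^3 + x^2} = \frac{23}{x} + \frac{8}{x^2} - \frac{18}{x-1} + \frac{26}{(x-1)^2} \]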
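The row reduction whose steps are omitted for the Case II system can be reproduced mechanically. The following is a minimal sketch, not the method used in the course: the helper name `rref` is our own, and it uses Python's standard `fractions` module so that all arithmetic is exact. It assumes the last column of each row is the right-hand side.

```python
from fractions import Fraction

def rref(aug):
    """Return the reduced row echelon form of an augmented matrix (list of rows)."""
    m = [[Fraction(v) for v in row] for row in aug]
    rows, cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(cols - 1):              # last column is the right-hand side
        # look for a row at or below pivot_row with a nonzero entry in this column
        pivot = next((r for r in range(pivot_row, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue                         # no pivot here: this variable is free
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [v / lead for v in m[pivot_row]]
        for r in range(rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return m

# Augmented matrix of the Case II system in the unknowns (A, B, C, D).
case_ii = [
    [1, 0, 1, 0, 5],
    [-2, 1, -1, 1, 6],
    [1, -2, 0, 0, 7],
    [0, 1, 0, 0, 8],
]
solution = [row[-1] for row in rref(case_ii)]
print([int(v) for v in solution])   # [23, 8, -18, 26]
```

The same function applied to the Case I matrix `[[2, 0, 0, 4], [0, -1, 0, 7], [0, 0, 2, 10]]` should return the right-hand column (2, -7, 5).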


Case III: irreducible quadratic factors. Technically speaking, if you are 
comfortable working with complex numbers and complex-valued functions, then 
you only ever have to consider factorizations of the denominator into linear factors. 
However, for various reasons it is convenient to have a method of partial fraction 
decomposition which does not require us to ever leave the realm of real numbers. 
For instance, for the following rational function

\[ \frac{10x^2 + 11x + 12}{(x^2+1)(x+1)} \]

we could factor the denominator into linear factors

\[ (x^2+1)(x+1) = (x+i)(x-i)(x+1), \]

and then proceed as in Case I (which we'll do below just to prove a point). However,
we can just as easily keep the quadratic factor $x^2 + 1$ as is in our computation.
Since in general the number of unknowns in a partial fraction decomposition must
be equal to the degree of the denominator polynomial, the quadratic factor has to
contribute two unknowns to the general form:

\[ \frac{10x^2 + 11x + 12}{(x^2+1)(x+1)} = \frac{Ax+B}{x^2+1} + \frac{C}{x+1} \]

We now proceed as in Case II by clearing denominators and getting an equality of
two polynomials:

\[ \begin{aligned}
10x^2 + 11x + 12 &= (Ax+B)(x+1) + C(x^2+1) \\
&= (A+C)x^2 + (A+B)x + (B+C)
\end{aligned} \]


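This gives us a system of equations:

\[ \begin{aligned}
A + C &= 10 \\
A + B &= 11 \\
B + C &= 12
\end{aligned} \]

which we can solve using Gaussian Elimination:

\[ \left[\begin{array}{ccc|c} 1 & 0 & 1 & 10 \\ 1 & 1 & 0 & 11 \\ 0 & 1 & 1 & 12 \end{array}\right]
\xrightarrow{\text{to RREF (steps omitted)}}
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 9/2 \\ 0 & 1 & 0 & 13/2 \\ 0 & 0 & 1 & 11/2 \end{array}\right] \]

This gives us the desired partial fraction expansion:

\[ \frac{10x^2 + 11x + 12}{(x^2+1)(x+1)} = \frac{9x+13}{2(x^2+1)} + \frac{11}{2(x+1)} \]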
We can check our work by re-doing the decomposition with complex numbers:

\[ \frac{10x^2 + 11x + 12}{(x+i)(x-i)(x+1)} = \frac{A}{x+i} + \frac{B}{x-i} + \frac{C}{x+1} \]

Cross-multiplying yields

\[ 10x^2 + 11x + 12 = A(x-i)(x+1) + B(x+i)(x+1) + C(x+i)(x-i) \]

Now we plug in the three denominator roots to get linear equations for the unknowns:


• ($x = -i$) In this case, the equation becomes $2 - 11i = (-2-2i)A$
• ($x = i$) In this case, the equation becomes $2 + 11i = (-2+2i)B$
• ($x = -1$) In this case, the equation becomes $11 = 2C$


This yields the system:

\[ \begin{aligned}
(-2-2i)A &= 2 - 11i \\
(-2+2i)B &= 2 + 11i \\
2C &= 11
\end{aligned} \]

which we can solve with Gaussian Elimination:

\[ \left[\begin{array}{ccc|c} -2-2i & 0 & 0 & 2-11i \\ 0 & -2+2i & 0 & 2+11i \\ 0 & 0 & 2 & 11 \end{array}\right]
\xrightarrow{\text{to RREF (steps omitted)}}
\left[\begin{array}{ccc|c} 1 & 0 & 0 & (9+13i)/4 \\ 0 & 1 & 0 & (9-13i)/4 \\ 0 & 0 & 1 & 11/2 \end{array}\right] \]

This yields the desired partial fraction decomposition:

\[ \frac{10x^2 + 11x + 12}{(x+i)(x-i)(x+1)} = \frac{9+13i}{4(x+i)} + \frac{9-13i}{4(x-i)} + \frac{11}{2(x+1)} \]

Finally, to pull this decomposition back into the realm of real numbers, we add
the first two fractions together (since those two correspond to a conjugate pair of
roots):

\[ \begin{aligned}
\frac{9+13i}{4(x+i)} + \frac{9-13i}{4(x-i)} + \frac{11}{2(x+1)}
&= \frac{(9+13i)(x-i) + (9-13i)(x+i)}{4(x+i)(x-i)} + \frac{11}{2(x+1)} \\
&= \frac{9x+13}{2(x^2+1)} + \frac{11}{2(x+1)}
\end{aligned} \]
This shows that working with complex numbers gives the same decomposition. 
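As a numerical sanity check of the Case III computation, the sketch below uses Python's built-in complex numbers (written `1j` for $i$). It solves the three one-variable equations obtained by plugging in the roots, then compares the original rational function with both the complex and the real decomposition at a few sample points. The function names and sample points are our own choices, and floating-point agreement is evidence, not a proof.

```python
# Solve the three equations obtained by plugging in x = -i, x = i, x = -1.
A = (2 - 11j) / (-2 - 2j)
B = (2 + 11j) / (-2 + 2j)
C = 11 / 2

def original(x):
    return (10 * x**2 + 11 * x + 12) / ((x**2 + 1) * (x + 1))

def complex_form(x):
    # decomposition over the linear factors (x + i), (x - i), (x + 1)
    return A / (x + 1j) + B / (x - 1j) + C / (x + 1)

def real_form(x):
    # decomposition that keeps the quadratic factor x^2 + 1
    return (9 * x + 13) / (2 * (x**2 + 1)) + 11 / (2 * (x + 1))

samples = [0.0, 0.5, 2.0, -3.0]          # any real x other than -1 works
max_gap = max(
    max(abs(original(x) - complex_form(x)), abs(original(x) - real_form(x)))
    for x in samples
)
print(A, B, C)
print(max_gap)                            # should be at rounding-error level
```

Note that `A` and `B` come out as complex conjugates of each other, which is why their two fractions combine into a single real fraction.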


Case IV: repeated quadratic factors. Finally, we arrive at perhaps the 
most involved case: repeated quadratic factors. However, the method here is really 
just the same as the methods in Cases II and III provided you know the rule for 
the general form. Here is an example: 

\[ \frac{6x^3 + 7x^2 + 8x + 9}{(x^2+x+1)^2} \]

Since the quadratic factor $x^2 + x + 1$ has multiplicity two, it has to show up twice
in the decomposition. Since the total number of unknowns needs to be four (=
degree of denominator polynomial), each occurrence of the quadratic factor has to
have two unknowns:

\[ \frac{6x^3 + 7x^2 + 8x + 9}{(x^2+x+1)^2} = \frac{Ax+B}{x^2+x+1} + \frac{Cx+D}{(x^2+x+1)^2} \]

Just as before, we cross-multiply and get an equality of polynomials:

\[ \begin{aligned}
6x^3 + 7x^2 + 8x + 9 &= (Ax+B)(x^2+x+1) + Cx + D \\
&= Ax^3 + (A+B)x^2 + (A+B+C)x + (B+D)
\end{aligned} \]


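Equating the two polynomials gives us the system of equations:

\[ \begin{aligned}
A &= 6 \\
A + B &= 7 \\
A + B + C &= 8 \\
B + D &= 9
\end{aligned} \]

which we can solve using Gaussian Elimination:

\[ \left[\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 6 \\ 1 & 1 & 0 & 0 & 7 \\ 1 & 1 & 1 & 0 & 8 \\ 0 & 1 & 0 & 1 & 9 \end{array}\right]
\xrightarrow{\text{to RREF (steps omitted)}}
\left[\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 6 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 8 \end{array}\right] \]

This gives us the desired partial fraction decomposition:

\[ \frac{6x^3 + 7x^2 + 8x + 9}{(x^2+x+1)^2} = \frac{6x+1}{x^2+x+1} + \frac{x+8}{(x^2+x+1)^2} \]

CHAPTER 2


Calculus review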


In this section we will summarize all the important definitions and results from
calculus. In general we will state these results for arbitrary nice functions; for a
summary of calculus results pertaining to special elementary functions, see Appendix A.
First, some terminology which will simplify some things. Given the set of
real numbers $\mathbb{R}$, we artificially adjoin two new symbols $+\infty$ and $-\infty$ to serve as
convenient bookends of the ordering. More specifically:


Definition 2.0.1. Define the extended real numbers to be the set $\mathbb{R}_{\pm\infty} :=
\mathbb{R} \cup \{-\infty, +\infty\}$. We extend the ordering on $\mathbb{R}$ to all of $\mathbb{R}_{\pm\infty}$ by declaring:

\[ -\infty < a < +\infty \quad \text{for every } a \in \mathbb{R}. \]

Unless we state otherwise, we do not extend the arithmetic operations $+, \cdot$ on $\mathbb{R}$
to include $\pm\infty$. It is important to realize the new elements $\pm\infty$ are not numbers
and there is not supposed to be anything super deep or special about adjoining
$\pm\infty$ to our real line. We primarily introduce them because doing so makes certain
commonly occurring statements and expressions shorter.


For instance, we can define bounded intervals and unbounded intervals with uniform
notation. Given $a, b \in \mathbb{R}$ such that $a < b$, an interval is a set of one of the following
forms:

\[ \begin{aligned}
(a,b) &:= \{x \in \mathbb{R} : a < x < b\} \\
[a,b) &:= \{x \in \mathbb{R} : a \leq x < b\} \\
(a,b] &:= \{x \in \mathbb{R} : a < x \leq b\} \\
[a,b] &:= \{x \in \mathbb{R} : a \leq x \leq b\} \\
(a,+\infty) &:= \{x \in \mathbb{R} : a < x\} \\
[a,+\infty) &:= \{x \in \mathbb{R} : a \leq x\} \\
(-\infty,b) &:= \{x \in \mathbb{R} : x < b\} \\
(-\infty,b] &:= \{x \in \mathbb{R} : x \leq b\} \\
(-\infty,+\infty) &:= \mathbb{R}
\end{aligned} \]

Intervals of the form $(a,b)$, $[a,b)$, $(a,b]$, $[a,b]$ are called bounded intervals.
Intervals of the form $(a,+\infty)$, $[a,+\infty)$, $(-\infty,b)$, $(-\infty,b]$, $(-\infty,+\infty)$ are called
unbounded intervals. Intervals of the form $(a,b)$, $(a,+\infty)$, $(-\infty,b)$, $(-\infty,+\infty)$ are
called open intervals. Intervals of the form $[a,b]$, $[a,+\infty)$, $(-\infty,b]$, $(-\infty,+\infty)$ are
called closed intervals.


Of course, intervals are not the only types of subsets of $\mathbb{R}$ which naturally arise
in this class. For instance, the natural domain of the tangent function is not an
interval, but instead a union of intervals:

\[ \begin{aligned}
\operatorname{domain}(\tan t) &= \{t \in \mathbb{R} : t \neq \pi/2 + \pi k \text{ for every } k \in \mathbb{Z}\} \\
&= \bigcup_{k \in \mathbb{Z}} \left( \frac{\pi}{2} + \pi k, \; \frac{\pi}{2} + \pi(k+1) \right)
\end{aligned} \]

In order to avoid too many technicalities, we will consider a subset $D \subseteq \mathbb{R}$ to be
nice if it can show up as the true domain of some function one would encounter in
freshman calculus. To be specific:


Definition 2.0.2. We call a set $D \subseteq \mathbb{R}$ nice if it is an interval or a union of a
sequence of intervals, i.e., if there exists a sequence of intervals $I_0, I_1, I_2, \ldots$ such
that

\[ D = \bigcup_{n \geq 0} I_n. \]


In general we will always restrict our attention to functions with nice domains, with
the domain of the tangent function being representative of the worst type of nice
domain. If you find the definition of nice too technical, then surprisingly little
is lost if you just interpret the adjective nice in the colloquial sense. Really, these
things won't matter too much for this class (since you're being graded primarily on
learning how to do calculations), but we introduce this terminology anyway so that
in these notes we can still restrict ourselves to making statements which are
literally true in a mathematical sense, without being overly abstract and technical.


In the exposition we will occasionally refer to elementary functions. We don’t 
mean anything too precise by this, although you can take the following as a rough 
definition: 


Rough Definition 2.0.3. An elementary function $f : D \to \mathbb{R}$ is any function
constructed from the following operations and basic functions:


(1) arithmetic operations: $+, -, \cdot, /$

(2) algebraic operations such as taking $n$th roots

(3) composition of functions

(4) the exponential $\exp : \mathbb{R} \to \mathbb{R}$ and logarithm $\ln : (0, +\infty) \to \mathbb{R}$,

(5) the trigonometric functions $\sin, \cos, \tan$

(6) the inverse trigonometric functions $\arcsin, \arccos, \arctan$


In other words, an elementary function is the type of function which shows up in 
freshman calculus. 


2.1. Limits 


In this section $D$ is a nice set. We will review the definition and rules for computing
limits. Recall that even if a function $f : (a,b) \to \mathbb{R}$ is only defined on an
open interval $(a,b)$, it sometimes still makes sense to ask what the limit of $f(x)$ is
as $x \to a$, i.e., $\lim_{x \to a} f(x)$, even though $f$ is not defined at $a$. This makes sense
because $a$ is an endpoint of $(a,b)$, so there are points in $(a,b)$ which are arbitrarily
close to $a$. In general we will consider functions $f : D \to \mathbb{R}$ where the domain $D$
is a nice set. Before we define limits, it first makes sense to define the set of
all points at which it might make sense to take a limit.

Definition 2.1.1. Define the closure of $D$ to be the slightly larger set $\operatorname{cl}(D) \supseteq D$
defined such that for every $a \in \mathbb{R}_{\pm\infty}$, we say that $a \in \operatorname{cl}(D)$ if there exists $x \in \mathbb{R}$
such that either:


(1) $x < a$ and $(x, a) \subseteq D$, or

(2) $a < x$ and $(a, x) \subseteq D$.

In particular, if $a \in D$, then $a \in \operatorname{cl}(D)$. In other words, $\operatorname{cl}(D)$ is the same thing as
$D$ plus all the endpoints of the intervals which define $D$. For example:

\[ \begin{aligned}
\operatorname{cl}((1,2]) &= [1,2] \\
\operatorname{cl}((-1,0) \cup (0,1]) &= [-1,1] \\
\operatorname{cl}(\operatorname{domain}(\tan t)) &= \mathbb{R}
\end{aligned} \]


We can now define in one definition every type of limit of a function encountered 
in freshman calculus: 


Definition 2.1.2. Suppose $f : D \to \mathbb{R}$ is a function with nice domain $D$. Suppose
$a \in \operatorname{cl}(D)$ and $L \in \mathbb{R}_{\pm\infty}$. We say the limit of $f$ as $x$ approaches $a$ exists and is
equal to $L$, notation:

\[ \lim_{x \to a} f(x) = L \]

if one of the following is satisfied (depending on whether $a, L = \pm\infty$ or not):

(1) ($a, L \in \mathbb{R}$) for every $\epsilon > 0$, there exists $\delta > 0$ such that for all $x \in D$, if
$0 < |x - a| < \delta$, then $|f(x) - L| < \epsilon$.

(2) ($a = +\infty$, $L \in \mathbb{R}$) for every $\epsilon > 0$, there exists $M \in \mathbb{R}$ such that for all
$x \in D$, if $M < x$, then $|f(x) - L| < \epsilon$.

(3) ($a = -\infty$, $L \in \mathbb{R}$) for every $\epsilon > 0$, there exists $M \in \mathbb{R}$ such that for all
$x \in D$, if $x < M$, then $|f(x) - L| < \epsilon$.

(4) ($a \in \mathbb{R}$, $L = +\infty$) for every $M \in \mathbb{R}$, there exists $\delta > 0$ such that for all
$x \in D$, if $0 < |x - a| < \delta$, then $M < f(x)$.

(5) ($a = L = +\infty$) for every $M \in \mathbb{R}$, there exists $N \in \mathbb{R}$ such that for all
$x \in D$, if $N < x$, then $M < f(x)$.

(6) ($a = -\infty$, $L = +\infty$) for every $M \in \mathbb{R}$, there exists $N \in \mathbb{R}$ such that for
all $x \in D$, if $x < N$, then $M < f(x)$.

(7) ($a \in \mathbb{R}$, $L = -\infty$) for every $M \in \mathbb{R}$, there exists $\delta > 0$ such that for all
$x \in D$, if $0 < |x - a| < \delta$, then $f(x) < M$.

(8) ($a = +\infty$, $L = -\infty$) for every $M \in \mathbb{R}$, there exists $N \in \mathbb{R}$ such that for
all $x \in D$, if $N < x$, then $f(x) < M$.

(9) ($a = L = -\infty$) for every $M \in \mathbb{R}$, there exists $N \in \mathbb{R}$ such that for all
$x \in D$, if $x < N$, then $f(x) < M$.


In general, for this class if and when we compute limits, we will not use
Definition 2.1.2 directly. Instead we will use known formulas for limits of special functions
(see Appendix A) along with various limit laws, including facts about continuity.
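To make case (1) of the limit definition concrete, here is a small numerical sketch for the claim that $x^2 \to 4$ as $x \to 2$. The example function and the choice $\delta = \min(1, \epsilon/5)$ are our own: if $|x - 2| < \delta \leq 1$, then $|x + 2| < 5$, so $|x^2 - 4| = |x - 2| \cdot |x + 2| < 5\delta \leq \epsilon$. The code only samples finitely many points, so it illustrates the definition rather than proving the limit.

```python
def delta_for(eps):
    """A delta that works for f(x) = x**2 at a = 2 with L = 4."""
    return min(1.0, eps / 5.0)

def check(eps, samples=1000):
    """Sample points with 0 < |x - 2| < delta and test |f(x) - 4| < eps."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)      # strictly between 0 and delta
        for x in (2.0 - offset, 2.0 + offset):
            if not abs(x * x - 4.0) < eps:
                return False
    return True

results = {eps: check(eps) for eps in (1.0, 0.1, 1e-3, 1e-6)}
print(results)
```

The point of the definition is the order of the quantifiers: $\epsilon$ is given first, and $\delta$ is then allowed to depend on it, exactly as `delta_for` does.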


Here is the general limit law for sums of limits: 


Addition Limit Law 2.1.3. Suppose $f, g : D \to \mathbb{R}$ are functions where $D$ is a
nice domain. Further suppose $a \in \operatorname{cl}(D)$ and the limits

\[ \lim_{x \to a} f(x) = L_f \quad \text{and} \quad \lim_{x \to a} g(x) = L_g \]

exist with $L_f, L_g \in \mathbb{R}_{\pm\infty}$. Then:

(1) if $L_f, L_g \in \mathbb{R}$, then
\[ \lim_{x \to a} (f+g)(x) = L_f + L_g \]

(2) if $L_f = +\infty$ and $L_g \neq -\infty$, or $L_g = +\infty$ and $L_f \neq -\infty$, then
\[ \lim_{x \to a} (f+g)(x) = +\infty \]

(3) if $L_f = -\infty$ and $L_g \neq +\infty$, or $L_g = -\infty$ and $L_f \neq +\infty$, then
\[ \lim_{x \to a} (f+g)(x) = -\infty \]

(4) if $L_f = +\infty$ and $L_g = -\infty$, or $L_f = -\infty$ and $L_g = +\infty$, then more subtle
investigation is needed (l'Hôpital's rule).


Here is the general limit law for products of limits: 


Product Limit Law 2.1.4. Suppose $f, g : D \to \mathbb{R}$ are functions where $D$ is a nice
domain. Further suppose $a \in \operatorname{cl}(D)$ and the limits

\[ \lim_{x \to a} f(x) = L_f \quad \text{and} \quad \lim_{x \to a} g(x) = L_g \]

exist with $L_f, L_g \in \mathbb{R}_{\pm\infty}$. Then:

(1) if $L_f, L_g \in \mathbb{R}$, then
\[ \lim_{x \to a} (f \cdot g)(x) = L_f \cdot L_g \]

(2) if one of the following is true:
    (a) $L_f = +\infty$ and $L_g > 0$
    (b) $L_f = -\infty$ and $L_g < 0$
    (c) $L_f < 0$ and $L_g = -\infty$
    (d) $L_f > 0$ and $L_g = +\infty$
then
\[ \lim_{x \to a} (f \cdot g)(x) = +\infty \]

(3) if one of the following is true:
    (a) $L_f = -\infty$ and $L_g > 0$
    (b) $L_f = +\infty$ and $L_g < 0$
    (c) $L_f < 0$ and $L_g = +\infty$
    (d) $L_f > 0$ and $L_g = -\infty$
then
\[ \lim_{x \to a} (f \cdot g)(x) = -\infty \]

(4) if one of the following is true:
    (a) $L_f = 0$ and $L_g = \pm\infty$
    (b) $L_f = \pm\infty$ and $L_g = 0$,
then more subtle investigation is needed (l'Hôpital's rule).


Finally, here is the general limit law for quotients of functions: 


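Quotient Limit Law 2.1.5. Suppose $f, g : D \to \mathbb{R}$ are functions where $D$ is a
nice domain. Define the set:

\[ D' := \{x \in D : g(x) \neq 0\} \subseteq \mathbb{R}. \]

Assume that $D'$ is also nice (for us it always will be) and suppose for $a \in \operatorname{cl}(D') \subseteq
\operatorname{cl}(D)$ the limits

\[ \lim_{x \to a} f(x) = L_f \quad \text{and} \quad \lim_{x \to a} g(x) = L_g \]

exist with $L_f, L_g \in \mathbb{R}_{\pm\infty}$. Then for the quotient function:

\[ \frac{f}{g} : D' \to \mathbb{R} \]

we have:

(1) if $L_f \in \mathbb{R}$, and $L_g \in \mathbb{R}$ and $L_g \neq 0$, we have
\[ \lim_{x \to a} \left(\frac{f}{g}\right)(x) = \frac{L_f}{L_g} \]

(2) if $L_f \neq \pm\infty$ and $L_g = \pm\infty$, we have
\[ \lim_{x \to a} \left(\frac{f}{g}\right)(x) = 0 \]

(3) if $L_f = +\infty$ and $0 < L_g < +\infty$, or $L_f = -\infty$ and $-\infty < L_g < 0$, then
\[ \lim_{x \to a} \left(\frac{f}{g}\right)(x) = +\infty \]

(4) if $L_f = +\infty$ and $-\infty < L_g < 0$, or $L_f = -\infty$ and $0 < L_g < +\infty$, then
\[ \lim_{x \to a} \left(\frac{f}{g}\right)(x) = -\infty \]

(5) otherwise a more subtle investigation is needed (l'Hôpital's rule).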
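The "more subtle investigation" cases in the three limit laws are the classical indeterminate forms. As an informal illustration (an analogy of ours, not part of the limit laws themselves), IEEE floating-point arithmetic in Python treats `math.inf` in a way that mirrors these laws: determinate combinations return `inf`, `-inf`, or `0.0`, while indeterminate ones return `nan` ("not a number"). The two helper functions at the end are hypothetical examples showing why $(+\infty) + (-\infty)$ cannot be assigned a single value.

```python
import math

inf = math.inf

# Determinate cases: the float results agree with the limit laws.
determinate = {
    "(+inf) + 5": inf + 5.0,          # +inf  (Addition Law, case 2)
    "(-inf) * 2": -inf * 2.0,         # -inf  (Product Law, case 3a)
    "(+inf) * (-2)": inf * -2.0,      # -inf  (Product Law, case 3b)
    "5 / (+inf)": 5.0 / inf,          # 0.0   (Quotient Law, case 2)
}

# Indeterminate forms: floats refuse to pick a value and return nan.
indeterminate = {
    "(+inf) + (-inf)": inf + (-inf),
    "0 * (+inf)": 0.0 * inf,
    "(+inf) / (+inf)": inf / inf,
}

# Both sums below have the form f + g with f -> +inf and g -> -inf
# as x -> +inf, yet they behave differently.
def sum_a(x):
    return (x + 7.0) + (-x)           # tends to 7

def sum_b(x):
    return (2.0 * x) + (-x)           # tends to +inf

print(determinate)
print({name: math.isnan(value) for name, value in indeterminate.items()})
print(sum_a(1e6), sum_b(1e6))
```

Keep in mind that Definition 2.0.1 deliberately does not extend arithmetic to $\pm\infty$; the float behavior is only a convenient mnemonic for which cases are determinate.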


Multivariable functions. We will also need to occasionally consider functions
with multiple variables:

\[ F(t,y) \quad \text{where } F : D \to \mathbb{R}, \; D \subseteq \mathbb{R}^2 \text{ is a subset of the } ty\text{-plane} \]

We will not attempt to define what a "nice" subset $D$ of the plane is, although
most of our domains will be of the form $D = I \times J$, where $I$ and $J$ are intervals
(such a set could be called a rectangle). Ultimately, we will not be in the
business of computing limits of multivariable functions in this class, although here
is a definition anyway:


Definition 2.1.6. Suppose $F : D \to \mathbb{R}$ is a two-variable function with domain
$D \subseteq \mathbb{R}^2$ a nice subset of the $ty$-plane (think $D = I \times J$, a rectangle). Given a real
number $L \in \mathbb{R}$ and a point $(t_0, y_0) \in D$, we say that $L$ is the limit of $F$ as $(t, y)$
approaches $(t_0, y_0)$, notation:

\[ \lim_{(t,y) \to (t_0,y_0)} F(t,y) = L \]

if: for every $\epsilon > 0$, there exists $\delta > 0$, such that for every $(t, y) \in D$,
if $0 < \sqrt{(t - t_0)^2 + (y - y_0)^2} < \delta$, then $|F(t,y) - L| < \epsilon$.


Even if we were computing multivariable limits in this class, we would rarely use
Definition 2.1.6 directly and instead rely on limit laws and facts about continuity.


Limit Laws for Multivariable Functions 2.1.7. Suppose $F, G : D \to \mathbb{R}$ are
two two-variable functions defined on a nice domain and suppose $(t_0, y_0) \in D$.
Furthermore, suppose $L_F, L_G \in \mathbb{R}$ are such that

\[ \lim_{(t,y) \to (t_0,y_0)} F(t,y) = L_F \quad \text{and} \quad \lim_{(t,y) \to (t_0,y_0)} G(t,y) = L_G. \]

Then:
(1) $\lim_{(t,y) \to (t_0,y_0)} (F + G)(t,y) = L_F + L_G$,
(2) $\lim_{(t,y) \to (t_0,y_0)} (F \cdot G)(t,y) = L_F \cdot L_G$.
Furthermore, define
\[ D' := \{(t,y) \in D : G(t,y) \neq 0\} \]
then, if $(t_0, y_0) \in D'$ and $L_G \neq 0$, we also have:
(3) $\lim_{(t,y) \to (t_0,y_0)} (F/G)(t,y) = L_F / L_G$.


2.2. Continuity 


The most basic property we might wish for a function $f : D \to \mathbb{R}$ to have is that it
is continuous. Here is the definition:


Definition 2.2.1. Suppose $f : D \to \mathbb{R}$ is a function with nice domain $D \subseteq \mathbb{R}$. We
say that $f$ is continuous if for every $a \in D$,

\[ \lim_{x \to a} f(x) = f(a). \]

Example 2.2.2. Here are some continuous functions:
(1) Every constant function $x \mapsto c : \mathbb{R} \to \mathbb{R}$ (where $c \in \mathbb{R}$) is continuous.
(2) The identity function $x \mapsto x : \mathbb{R} \to \mathbb{R}$ is continuous.
(3) The absolute value function $x \mapsto |x| := \sqrt{x^2} : \mathbb{R} \to \mathbb{R}$ is continuous.
(4) The square root function $x \mapsto \sqrt{x} : [0, +\infty) \to \mathbb{R}$ is also continuous.


The following shows how continuity is preserved under the basic arithmetic operations:


Proposition 2.2.3. Suppose $f, g : D \to \mathbb{R}$ are continuous functions on a nice
domain $D$. Then the following functions are also continuous on $D$:

(1) $f + g : D \to \mathbb{R}$,
(2) $f \cdot g : D \to \mathbb{R}$.

Furthermore, define the set
\[ D' := \{x \in D : g(x) \neq 0\} \]
and assume that $D'$ is nice (for us it always will be). Then

(3) $f/g : D' \to \mathbb{R}$ is continuous.


The following tells us that continuity is preserved when you compose two composable
continuous functions:


Proposition 2.2.4 (Composition and continuity). Suppose $f : D \to \mathbb{R}$ is continuous
with nice domain $D$ and $g : E \to \mathbb{R}$ is continuous with nice domain $E$ such that
$f(D) \subseteq E$. Then $g \circ f : D \to \mathbb{R}$ is continuous.


Combining Example 2.2.2(3) with Proposition 2.2.4 gives us:


Corollary 2.2.5. If $f : D \to \mathbb{R}$ is continuous with nice domain $D$, then so is
$|f| : D \to \mathbb{R}$, given by

\[ |f|(x) := |f(x)|, \quad \text{for } x \in D. \]


The following is an important theorem about continuous functions:

Intermediate Value Theorem 2.2.6. Suppose $f : [a,b] \to \mathbb{R}$ is continuous, with
$a < b$ in $\mathbb{R}$. Let $y$ be a number strictly between $f(a)$ and $f(b)$, i.e.,

\[ f(a) < y < f(b) \quad \text{or} \quad f(b) < y < f(a). \]

Then there is $x_0 \in (a,b)$ such that $f(x_0) = y$.
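The Intermediate Value Theorem only asserts that some $x_0$ exists; the bisection method is one standard way to approximate such a point numerically. The sketch below is our own illustration (the function name `bisect`, the tolerance, and the example $f(x) = x^3 + x$ are assumptions, not from the text). At every step it keeps an interval on which $f - y$ changes sign, so the theorem guarantees a solution inside it.

```python
def bisect(f, a, b, y, tol=1e-12):
    """Approximate x0 in (a, b) with f(x0) = y, assuming f is continuous
    on [a, b] and y lies strictly between f(a) and f(b)."""
    ga, gb = f(a) - y, f(b) - y
    if ga * gb >= 0:
        raise ValueError("y must lie strictly between f(a) and f(b)")
    while b - a > tol:
        mid = (a + b) / 2.0
        gm = f(mid) - y
        if gm == 0:
            return mid                  # hit the value exactly
        if ga * gm < 0:
            b, gb = mid, gm             # sign change is in the left half
        else:
            a, ga = mid, gm             # sign change is in the right half
    return (a + b) / 2.0

# Example: f(0) = 0 < 5 < 10 = f(2), so some x0 in (0, 2) has f(x0) = 5.
x0 = bisect(lambda x: x**3 + x, 0.0, 2.0, 5.0)
print(x0, x0**3 + x0)
```

Each pass halves the interval, so roughly 40 iterations shrink an interval of length 2 below the tolerance used here.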


The following lemma says that if a continuous function is nonzero at a point, then 
it must be nonzero on a neighborhood of that point: 


Bump Lemma 2.2.7. Suppose $f : I \to \mathbb{R}$ is continuous, $I \subseteq \mathbb{R}$ is an interval,
and $t_0 \in I$ is such that $f(t_0) \neq 0$. Then there is $\alpha < t_0 < \beta$ such that for every
$t \in (\alpha, \beta)$, $f(t) \neq 0$.


Monotonicity and inverses. In this subsection, we discuss monotone func- 
tions, the existence of inverse functions, and when inverse functions are continuous. 


Definition 2.2.8. Suppose $f : D \to \mathbb{R}$ is a function where $D \subseteq \mathbb{R}$ is a nice set. We
say that $f$ is

(1) increasing if for all $x, y \in D$, if $x < y$, then $f(x) \leq f(y)$,

(2) strictly increasing if for all $x, y \in D$, if $x < y$, then $f(x) < f(y)$,

(3) decreasing if for all $x, y \in D$, if $x < y$, then $f(x) \geq f(y)$,

(4) strictly decreasing if for all $x, y \in D$, if $x < y$, then $f(x) > f(y)$.

Furthermore, we say that $f$ is monotone if it satisfies any of properties (1)-(4),
and we say that $f$ is strictly monotone if it satisfies property (2) or (4).


Definition 2.2.9. Suppose $f : D \to \mathbb{R}$ is an injective function (see Definition B.5.1),
and $D \subseteq \mathbb{R}$ is a nice set. We define the inverse function of $f$ to be the function
$f^{-1} : \operatorname{range}(f) \to \mathbb{R}$ defined by:

\[ f^{-1}(y) = x \quad :\Longleftrightarrow \quad f(x) = y \]

for all $x \in D$ and $y \in \operatorname{range}(f)$.


Strictly monotone functions are a big source of injective functions:


Theorem 2.2.10. Suppose $f : D \to \mathbb{R}$ is a strictly monotone function and $D$ is
a nice set. Then $f$ is injective and so it has an inverse function $f^{-1} : f(D) \to \mathbb{R}$.
Moreover, if one of the following holds:

(1) $f$ is continuous, or

(2) $D$ is an interval,


then $f^{-1}$ is continuous and strictly monotone.


Multivariable functions. There is also a definition of what it means for a 
multivariable function to be continuous: 


Definition 2.2.11. Suppose $F : D \to \mathbb{R}$ is a two-variable function with domain
$D \subseteq \mathbb{R}^2$ a nice subset of the $ty$-plane. We say that $F$ is continuous (on $D$) if for
every point $(t_0, y_0) \in D$, we have:

\[ \lim_{(t,y) \to (t_0,y_0)} F(t,y) = F(t_0, y_0). \]

Most of the multivariable functions we will consider will be continuous, and their
continuity can be determined by using the following rules, as well as the continuity
of the underlying single-variable functions:

Continuity Laws for Multivariable Functions 2.2.12. Suppose $F, G : D \to \mathbb{R}$
are continuous functions with domain $D \subseteq \mathbb{R}^2$ a nice subset of the $ty$-plane. Then:


(1) (Projection functions) the functions $f(t,y) = t$ and $g(t,y) = y$ are continuous,
as functions $f, g : D \to \mathbb{R}$.

(2) (Linearity) Given arbitrary $\alpha, \beta \in \mathbb{R}$, the function:
\[ \alpha F + \beta G : D \to \mathbb{R} \]
is also continuous.

(3) (Products) The function:
\[ F \cdot G : D \to \mathbb{R} \]
is also continuous.

(4) (Quotients) Define the set
\[ D' := \{(t,y) \in D : G(t,y) \neq 0\} \]
Then the function:
\[ \frac{F}{G} : D' \to \mathbb{R} \]
is also continuous.

(5) (Compositions) Suppose $f : E \to \mathbb{R}$ is a continuous one-variable function
where $E \subseteq \mathbb{R}$ is a nice domain. Furthermore, suppose $F(D) \subseteq E$. Then
the composition:
\[ f \circ F : D \to \mathbb{R} \]
is also a continuous function.


2.3. Differentiation 


In this section $D \subseteq \mathbb{R}$ is a nice set. Given a function $f : D \to \mathbb{R}$, if it is differentiable
at a point in its domain, then that means the function $f$ can be approximated
suspiciously well by a linear tangent line at that point. The following proposition
gives three equivalent ways of saying exactly this:


Proposition 2.3.1. Suppose $f : D \to \mathbb{R}$ is a function and $a \in D$. The following
are equivalent:
(1) (Standard definition) The limit
\[ \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \ell \]
exists and is finite (i.e., $\ell \in \mathbb{R}$).
(2) (Taylor definition) There exists a number $d \in \mathbb{R}$ and a function $R : D \to \mathbb{R}$
such that
\[ f(x) = f(a) + d(x - a) + R(x) \quad \text{and} \quad \lim_{x \to a} \frac{R(x)}{x - a} = 0. \]
(3) (Carathéodory definition) There exists a function $q : D \to \mathbb{R}$ which is
continuous at $a$ such that
\[ f(x) = f(a) + q(x)(x - a). \]
Furthermore, if any (equivalently all) of (1), (2), and (3) holds, then

(4) $\ell = d = q(a)$, and
(5) $f$ is continuous at $a$.

Definition 2.3.2. We say that a function $f : D \to \mathbb{R}$ is differentiable on $D$ if for
every $a \in D$, the equivalent conditions of Proposition 2.3.1 hold. In this case, we
define the derivative of $f$ at $a$ to be

\[ f'(a) := \lim_{x \to a} \frac{f(x) - f(a)}{x - a}. \]

In this class, since we will be working with special elementary functions and not
arbitrary differentiable functions, we generally will not have to use the formal
definition when computing derivatives. In general we will be able to compute all
relevant derivatives by employing the following rules as well as the known formulas
(see Appendix A) for the derivatives of the functions we care about.
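The three equivalent descriptions of differentiability in Proposition 2.3.1 can be seen side by side in a concrete case. The sketch below takes $f(x) = x^3$ at $a = 2$ (an example of our choosing) with the explicit Carathéodory function $q(x) = x^2 + ax + a^2$, and uses exact rational arithmetic from Python's `fractions` module. It checks the identity in (3) and that the Taylor remainder satisfies $R(x)/(x - a) = h(3a + h)$ for $x = a + h$, which visibly tends to 0 as $h \to 0$.

```python
from fractions import Fraction

a = Fraction(2)

def f(x):
    return x**3

def q(x):
    # Caratheodory function for f(x) = x^3 at a: f(x) = f(a) + q(x) * (x - a)
    return x**2 + a * x + a**2

d = q(a)                       # candidate derivative: 3 * a^2

def R(x):
    # Taylor remainder defined by f(x) = f(a) + d * (x - a) + R(x)
    return f(x) - f(a) - d * (x - a)

ratios = []
for h in [Fraction(1, 10), Fraction(1, 1000), Fraction(-1, 1000)]:
    x = a + h
    assert f(x) == f(a) + q(x) * (x - a)        # identity (3) holds exactly
    assert R(x) / (x - a) == h * (3 * a + h)    # remainder ratio shrinks with h
    ratios.append(R(x) / (x - a))

print(d)        # 12
print(ratios)
```

Here $d = q(a) = 3a^2 = 12$, in agreement with item (4) of the proposition and with the usual power rule.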


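Example 2.3.3. (1) Constant functions are differentiable with derivative 0.
(2) Let $f : \mathbb{R} \to \mathbb{R}$ be such that $f(x) = x^n$ (with $n \geq 1$ an integer). Then $f$ is
differentiable, and for every $a \in \mathbb{R}$, $f'(a) = n a^{n-1}$. To see this, note by the Difference of
Powers Formula,

\[ f(x) - f(a) = x^n - a^n = (x - a) \cdot (x^{n-1} + a x^{n-2} + a^2 x^{n-3} + \cdots + a^{n-2} x + a^{n-1}), \]

thus for $x \neq a$, we have

\[ \frac{f(x) - f(a)}{x - a} = x^{n-1} + a x^{n-2} + a^2 x^{n-3} + \cdots + a^{n-2} x + a^{n-1}, \]

and so

\[ \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = n a^{n-1}. \]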
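The Difference of Powers Formula used in this example is easy to test with exact integer arithmetic. In the sketch below (the helper name `power_sum` and the sample values $n = 5$, $a = 3$ are our own), the sum has $n$ terms, and setting $x = a$ turns every term into $a^{n-1}$, which is where the factor $n$ in $n a^{n-1}$ comes from.

```python
def power_sum(x, a, n):
    """x^(n-1) + a*x^(n-2) + ... + a^(n-2)*x + a^(n-1), a sum of n terms."""
    return sum(a**k * x**(n - 1 - k) for k in range(n))

n, a = 5, 3
for x in [-2, 0, 1, 4, 10]:
    # Difference of Powers Formula, checked exactly on integers
    assert x**n - a**n == (x - a) * power_sum(x, a, n)

# At x = a the sum collapses to n copies of a^(n-1).
print(power_sum(a, a, n), n * a**(n - 1))   # 405 405
```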


The following rules show how computing the derivative interacts with the basic 
arithmetic operations: 


Proposition 2.3.4. Suppose $f, g : D \to \mathbb{R}$ are differentiable on $D$. Then
\[ f + g, \; f \cdot g : D \to \mathbb{R} \]
are differentiable on $D$, and for every $a \in D$

(1) $(f + g)'(a) = f'(a) + g'(a)$,
(2) (product rule) $(f \cdot g)'(a) = f(a) g'(a) + f'(a) g(a)$.
Furthermore, with $D' := \{x \in D : g(x) \neq 0\} \subseteq D$, if $D'$ is nice, then the function

\[ \frac{f}{g} : D' \to \mathbb{R} \]

is differentiable and
(3) (quotient rule) for every $a \in D'$

\[ \left(\frac{f}{g}\right)'(a) = \frac{g(a) f'(a) - f(a) g'(a)}{g(a)^2}. \]

Remark 2.3.5. An immediate consequence of Proposition 2.3.4(1) and (2) is that
if we have constants $c, d \in \mathbb{R}$ and differentiable functions $f, g : D \to \mathbb{R}$, then

\[ (cf + dg)' = cf' + dg'. \]

In linear algebra terms, differentiation is $\mathbb{R}$-linear (i.e., it is a linear transformation
on the $\mathbb{R}$-vector space of differentiable functions $D \to \mathbb{R}$).


Differentiation also behaves well with composition of differentiable functions:

Chain Rule 2.3.6. Suppose $f : D \to \mathbb{R}$, $g : E \to \mathbb{R}$ are differentiable functions
such that $f(D) \subseteq E$. Then $g \circ f : D \to \mathbb{R}$ is differentiable, and for every $a \in D$

\[ (g \circ f)'(a) = g'(f(a)) \cdot f'(a). \]

In theory, you should be capable of computing the derivative of any elementary
function provided you know the rules above as well as the formulas for the
derivatives of the primitive functions of interest given in Appendix A. Of course,
this should not be news to you.
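A quick way to catch mistakes when applying the chain rule by hand is to compare against a numerical difference quotient. The sketch below is an illustration under our own assumptions: the symmetric difference quotient `central_diff`, the step size, and the example $g(f(x)) = \sin(x^2 + 1)$ are not from the text, and the comparison is only accurate up to rounding and truncation error.

```python
import math

def central_diff(func, x, h=1e-5):
    """Symmetric difference quotient; approximates func'(x) for smooth func."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def f(x):
    return x**2 + 1.0             # inner function, f'(x) = 2x

def composite(x):
    return math.sin(f(x))         # (g o f)(x) with outer function g = sin

a = 0.7
numeric = central_diff(composite, a)
chain_rule = math.cos(f(a)) * (2.0 * a)    # g'(f(a)) * f'(a)
print(numeric, chain_rule)
```

The two printed numbers should agree to many decimal places; a large discrepancy usually signals a forgotten inner derivative.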


The following is a very useful consequence of the so-called Mean Value Theorem for
Derivatives. Note that Corollary 2.3.7 and Identity Criterion 2.3.8 are only true
when the domain is an interval.


Corollary 2.3.7. Suppose $D$ is an interval and $f : D \to \mathbb{R}$ is differentiable. Then
$f$ is a constant function iff $f'(x) = 0$ for all $x \in D$.


A common question we might ask when it comes to uniqueness of solutions of ODEs
is: when are two functions $f, g : I \to \mathbb{R}$ the same? If $f$ and $g$ are differentiable (which
pretty much all of our functions will be), the following makes this question easier
to answer:


Identity Criterion 2.3.8. Suppose $D$ is an interval and $f, g : D \to \mathbb{R}$ are
differentiable such that $f'(x) = g'(x)$ for every $x \in D$. Then there exists a constant
$C \in \mathbb{R}$ such that $f(x) = g(x) + C$ for all $x \in D$. Furthermore, if there is a point
$x_0 \in D$ such that $f(x_0) = g(x_0)$, then $f(x) = g(x)$ for all $x \in D$.


Proof. The function $f - g : D \to \mathbb{R}$ is differentiable by Proposition 2.3.4 and
$(f - g)'(x) = f'(x) - g'(x) = 0$ for all $x \in D$. By Corollary 2.3.7 there is a constant
$C \in \mathbb{R}$ such that $(f - g)(x) = C$ for all $x \in D$, i.e., $f(x) = g(x) + C$ for all $x \in D$.

Now, suppose there is $x_0 \in D$ such that $f(x_0) = g(x_0)$. Then also $f(x_0) =
g(x_0) + C$, so we can conclude that $C = 0$. Thus $f(x) = g(x)$ for all $x \in D$. $\square$


Inverse functions and monotonicity. Sometimes differentiable functions 
are also invertible. In this subsection we talk about the differentiability of the 
inverse function. 


Theorem 2.3.9. Assume $f : D \to \mathbb{R}$ is a differentiable injective function and
$D \subseteq \mathbb{R}$ is a nice set. Define $I := f(D)$ and
\[ I' := \{y \in I : f'(f^{-1}(y)) \neq 0\}. \]
Then the function $f^{-1} : I' \to \mathbb{R}$ is differentiable, and for every $y_0 \in I'$ we have
\[ (f^{-1})'(y_0) = \frac{1}{f'(f^{-1}(y_0))}. \]
We can also use derivatives to check for monotonicity, which enables us to show that a
function is invertible.
Theorem 2.3.10. Suppose $f : I \to \mathbb{R}$ is a differentiable function on an interval $I$.
Then:
(1) $f$ is increasing if $f'(x) \geq 0$ for all $x \in I$,
(2) $f$ is strictly increasing if $f'(x) > 0$ for all $x \in I$,
(3) $f$ is decreasing if $f'(x) \leq 0$ for all $x \in I$, and
(4) $f$ is strictly decreasing if $f'(x) < 0$ for all $x \in I$.




Multivariable functions. A full exploration of multivariable calculus (differ- 
entiation and integration) requires a course like Math32A or Math131B. For our 
purposes, we will need to know a few things about partial derivatives: 


Definition 2.3.11. Suppose $F : D \to \mathbb{R}$ is a function with nice domain $D \subseteq \mathbb{R}^2$ (so
$F = F(t,y)$ is a two-variable function). Let $(t_0, y_0) \in D$ be a fixed point. We define
the partial derivative of $F$ with respect to $t$ at $(t_0, y_0)$ to be the following
limit, if it exists and is finite:

\[ \frac{\partial F}{\partial t}(t_0, y_0) := \lim_{t \to 0} \frac{F(t_0 + t, y_0) - F(t_0, y_0)}{t} \]

and we define the partial derivative of $F$ with respect to $y$ at $(t_0, y_0)$ to be
the following limit, if it exists and is finite:

\[ \frac{\partial F}{\partial y}(t_0, y_0) := \lim_{y \to 0} \frac{F(t_0, y_0 + y) - F(t_0, y_0)}{y} \]


In practice, a partial derivative is the same thing as a single-variable derivative 
where you treat the other variable as a constant. In particular, all of the rules from 
the preceding subsection apply to partial derivatives when you view them this way 
(product rule, chain rule, etc.). 


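Definition 2.3.12. Suppose $D \subseteq \mathbb{R}^2$ is a nice subset of $\mathbb{R}^2$, and $F : D \to \mathbb{R}$ is a
two-variable function. We say that:
(1) $F$ has first-order partial derivatives if at every point $(t_0, y_0) \in D$, the
partial derivatives

\[ \frac{\partial F}{\partial t}(t_0, y_0) \quad \text{and} \quad \frac{\partial F}{\partial y}(t_0, y_0) \]

exist and are finite;
(2) $F$ has second-order partial derivatives if:
    (i) $F$ has first-order partial derivatives, and
    (ii) the functions $\frac{\partial F}{\partial t}, \frac{\partial F}{\partial y} : D \to \mathbb{R}$ also have first-order partial derivatives.
(3) $F$ has continuous second-order partial derivatives if:
    (a) $F$ has second-order partial derivatives, and
    (b) each of the functions:

\[ \frac{\partial^2 F}{\partial t^2}, \; \frac{\partial^2 F}{\partial t \partial y}, \; \frac{\partial^2 F}{\partial y \partial t}, \; \frac{\partial^2 F}{\partial y^2} : D \to \mathbb{R} \]

    is continuous.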


In general, all of the two-variable functions we’ll consider have continuous partial 
derivatives of all orders, including first and second order, at least wherever they 
are defined. In this case, the following theorem tells us that the “mixed” second 
order partial derivatives are the same. This will be useful for getting a checkable 
criterion for exactness in Section 


Clairaut-Schwarz Theorem 2.3.13 (Equality of mixed partial derivatives). Suppose
$F : D \to \mathbb{R}$ where $D \subseteq \mathbb{R}^2$ is a nice subset of the plane $\mathbb{R}^2$ has continuous
second-order partial derivatives, i.e.,

\[ \frac{\partial^2 F}{\partial t^2}, \; \frac{\partial^2 F}{\partial y^2}, \; \frac{\partial^2 F}{\partial t \partial y}, \; \frac{\partial^2 F}{\partial y \partial t} \]

all exist and are continuous (the functions we'll deal with always satisfy this property).
Then for all $(t_0, y_0) \in D$:

\[ \frac{\partial^2 F}{\partial t \partial y}(t_0, y_0) = \frac{\partial^2 F}{\partial y \partial t}(t_0, y_0). \]
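The equality of mixed partial derivatives can be checked on a concrete function. In the sketch below we take $F(t,y) = t^2 \sin y + e^{ty}$ (our own example), write down its two first-order partial derivatives by hand, and then differentiate each one numerically in the other variable. The step size `h` and the evaluation point are arbitrary choices, and the agreement is only up to finite-difference error.

```python
import math

# First-order partials of F(t, y) = t^2 * sin(y) + exp(t * y), computed by hand.
def F_t(t, y):
    return 2.0 * t * math.sin(y) + y * math.exp(t * y)

def F_y(t, y):
    return t**2 * math.cos(y) + t * math.exp(t * y)

h = 1e-5
t0, y0 = 0.5, 1.2

# Differentiate dF/dt in y, and dF/dy in t, using symmetric difference quotients.
dy_of_F_t = (F_t(t0, y0 + h) - F_t(t0, y0 - h)) / (2.0 * h)
dt_of_F_y = (F_y(t0 + h, y0) - F_y(t0 - h, y0)) / (2.0 * h)

# Closed form of the mixed partial, for comparison.
exact = 2.0 * t0 * math.cos(y0) + (1.0 + t0 * y0) * math.exp(t0 * y0)
print(dy_of_F_t, dt_of_F_y, exact)
```

All three printed values should agree to several decimal places, as the theorem predicts for functions with continuous second-order partial derivatives.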


2.4. Integration 
Definite integrals. When it comes to integration, the most fundamental notion
is to define the following: given a function $f : [a,b] \to \mathbb{R}$, what does it mean
for the function $f$ to be integrable on $[a,b]$ and how do you define $\int_a^b f(t)\,dt$ if this
integral is to exist? We will not dive into this question and instead assume you
have a working understanding of what this means. In particular, we define:


Definition 2.4.1. Suppose $a < b$ in $\mathbb{R}$. We say that the function $f : [a,b] \to \mathbb{R}$ is
integrable if the definite integral

\[ \int_a^b f(t)\,dt \]

exists and is finite (i.e., it equals a real number from $\mathbb{R}$). If $f : [a,b] \to \mathbb{R}$ is
integrable, then we also define:

\[ \int_b^a f(t)\,dt := -\int_a^b f(t)\,dt. \]

Given any function $g : D \to \mathbb{R}$ and $a \in \mathbb{R}$, we define:

\[ \int_a^a g(t)\,dt := 0. \]


Here are some basic facts about what types of functions are integrable: 


Fact 2.4.2. Suppose $f : [a,b] \to \mathbb{R}$ is a function. Then:
(1) if $f$ is continuous, then $f$ is integrable,
(2) if $f$ is piecewise continuous, then $f$ is integrable,
(3) if $f$ is integrable and $\tilde{f} : [a,b] \to \mathbb{R}$ is a function such that the set:
\[ \{x \in [a,b] : f(x) \neq \tilde{f}(x)\} \]
is finite, then $\tilde{f}$ is also integrable and

\[ \int_a^b \tilde{f}(t)\,dt = \int_a^b f(t)\,dt. \]

Fact 2.4.2 tells us that basically every function $f : [a,b] \to \mathbb{R}$ we come across in
this class will be integrable. Furthermore, Fact 2.4.2(3) tells us that as far as computing
integrals is concerned, we can safely change finitely many values of the function
and still arrive at the same answer (for instance, if you are integrating a step
function and you're not sure about the values at the endpoints).


The following law for computing definite integrals is used all the time: 


Lemma 2.4.3 (Linearity of Integration). Let $f, g : [a,b] \to \mathbb{R}$ be integrable functions,
and let $\alpha \in \mathbb{R}$. Then

(1) $\alpha f : [a,b] \to \mathbb{R}$ is integrable, and $\int_a^b \alpha f(t)\,dt = \alpha \int_a^b f(t)\,dt$,

(2) $f + g : [a,b] \to \mathbb{R}$ is integrable, and $\int_a^b (f + g)(t)\,dt = \int_a^b f(t)\,dt + \int_a^b g(t)\,dt$.

The following is also very useful, especially if the behavior of a function changes on
different intervals:


Lemma 2.4.4 (Additivity over intervals). Suppose $f : [a,b] \to \mathbb{R}$ is a function and
$c \in (a,b)$. Then $f$ is integrable on $[a,b]$ iff $f$ is integrable on $[a,c]$ and $[c,b]$. In this
case,

\[ \int_a^b f(t)\,dt = \int_a^c f(t)\,dt + \int_c^b f(t)\,dt. \]


The following two theorems tell us that integration and differentiation are inverse 
operations, which is what makes integration so useful when it comes to solving 
differential equations. First a definition: 


Definition 2.4.5. Suppose $f : D \to \mathbb{R}$ is a continuous function with a nice domain
$D \subseteq \mathbb{R}$. A function $F : D \to \mathbb{R}$ is called an antiderivative of $f$ if:

(i) $F$ is differentiable, and

(ii) for every $t \in D$, $F'(t) = f(t)$.


The so-called first fundamental theorem of calculus provides us a method of comput- 
ing the exact value of the definite integral of a function provided we have available 
to us an antiderivative of that function: 


First Fundamental Theorem of Calculus 2.4.6. Suppose $f : [a,b] \to \mathbb{R}$ is a
continuous function on $[a,b]$ and differentiable on $(a,b)$. Then:


$$\int_a^b f'(t)\,dt = f(b) - f(a).$$


The so-called second fundamental theorem of calculus provides us a method of using 
definite integrals to construct an antiderivative of a continuous function: 


Second Fundamental Theorem of Calculus 2.4.7. Suppose $f : D \to \mathbb{R}$ is a
continuous function with a nice domain $D \subseteq \mathbb{R}$, and fix $t_0 \in D$. Let $I \subseteq D$ be the
largest interval such that $t_0 \in I$. Consider the function $F : I \to \mathbb{R}$ defined by


$$F(t) := \int_{t_0}^t f(s)\,ds$$


for every $t \in I$. Then


(1) $F$ is differentiable on $I$, and
(2) $F'(t) = f(t)$ for every $t \in I$, i.e., $F$ is an antiderivative of $f$ on the
interval $I$.
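
The second fundamental theorem can be checked numerically. The sketch below builds $F(t) = \int_{t_0}^t f(s)\,ds$ with a midpoint rule for the sample choice $f = \cos$, $t_0 = 0$ (an assumption made only for illustration), and compares a finite-difference derivative of $F$ with $f$. All helper names are hypothetical.

```python
import math


def midpoint_integral(f, a, b, n=2_000):
    """Approximate the integral of f from a to b with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def F(t, f=math.cos, t0=0.0):
    """Numerical stand-in for F(t) = integral of f from t0 to t."""
    return midpoint_integral(f, t0, t)


def central_difference(G, t, h=1e-4):
    """Approximate G'(t) by a symmetric difference quotient."""
    return (G(t + h) - G(t - h)) / (2 * h)


t = 1.3
approx_derivative = central_difference(F, t)  # should be close to cos(1.3)
```

This is only a sanity check, since both the integral and the derivative are approximated, but it shows the inverse relationship concretely: differentiating the accumulated integral recovers the integrand.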


Indefinite integrals. When we later determine the general solution of a dif- 
ferential equation, we need to be able to find (and parametrize) all solutions of 
the differential equation, not just a particular one. In terms of antiderivatives, this 
means we need to be able to find (and parametrize) all antiderivatives of a partic- 
ular function, not just one antiderivative. This is taken care of by the notion of 
indefinite integral: 


Definition 2.4.8. Suppose $f : D \to \mathbb{R}$ is a continuous function with a nice domain
$D \subseteq \mathbb{R}$. The indefinite integral of $f$ is an infinite family of functions:


$$F(t;C) = F(t) + C$$
where $C \in \mathbb{R}$ and $F : D \to \mathbb{R}$ is a particular antiderivative of $f$. This situation is
often denoted by writing:


$$\int f(t)\,dt = F(t) + C.$$


Remark 2.4.9. Technically speaking, the indefinite integral of $f$ really should
be the family of all antiderivatives of $f$. In particular, each so-called connected
component of the domain of $f$ requires its own constant of integration. For instance,
for the function $f(t) = 1/t$ viewed as a function $(-\infty,0) \cup (0,+\infty) \to \mathbb{R}$, the
indefinite integral really should be:


$$\int \frac{dt}{t} = \begin{cases} \ln(t) + C_1 & \text{if } t > 0 \\ \ln(-t) + C_2 & \text{if } t < 0 \end{cases}$$


where $C_1, C_2 \in \mathbb{R}$ could be the same number, or could be different. Simply writing:


$$\int \frac{dt}{t} = \ln|t| + C$$


does not actually give us every possible antiderivative of $1/t$ on the domain $(-\infty,0) \cup (0,+\infty)$
because it requires us to use the same constant of integration on both “connected
components” $(-\infty,0)$ and $(0,+\infty)$. This is a very minor issue which we are
happy to ignore since the particular solutions to initial value problems (which we
hope to be unique) will have intervals as their domain.


We also have the second fundamental theorem of calculus for indefinite integrals: 


Second Fundamental Theorem of Calculus 2.4.10 (Indefinite version). Suppose
$f : D \to \mathbb{R}$ is a continuous function with a nice domain $D \subseteq \mathbb{R}$. Then


$$\frac{d}{dt} \int f(t)\,dt = f(t).$$
Theorem 2.4.10 is to be interpreted as: for every antiderivative $F(t) + C$ of $f(t)$,


$$\frac{d}{dt}\bigl(F(t) + C\bigr) = f(t).$$
First-order differential equations


3.1. Implicit differential equations 


In this course we will be primarily concerned with first-order differential equations,
as well as higher-order linear differential equations. This raises the question:


What is a differential equation and what is the order of a differential equation?


We will answer this question by first giving a very general definition of differential
equation which will encompass nearly all differential equations we will encounter in
this chapter and in Chapter 4.


Definition 3.1.1. An implicit differential equation (of order $r$) is an equation
which can be written in the form


$$F(t, y, y', y'', \ldots, y^{(r)}) = 0 \tag{$\dagger$}$$


where $F$ is a real-valued function of $r + 2$ variables. The order is the order $r$ of
the highest derivative $y^{(r)}$ of $y$ which appears in the equation.

A solution to ($\dagger$) is a function $y : I \to \mathbb{R}$ (where $I \subseteq \mathbb{R}$ is an interval) which is
differentiable at least $r$ times such that


$$F(t, y(t), y'(t), \ldots, y^{(r)}(t)) = 0 \quad \text{for every } t \in I,$$


i.e., for every $t \in I$, when you plug $t, y(t), y'(t), \ldots, y^{(r)}(t)$ into the function $F$ the
output is zero.


We now give some examples of implicit differential equations and some of their 
solutions, in increasing order of order. 


Zeroth order. Here is an implicit differential equation of order 0: 


$$y^5 + 2y^4 + 3y^3 + 4y^2 + 5y + 6 = 0 \tag{3.1}$$
Given a solution $\alpha \in \mathbb{R}$ of the polynomial equation
$$X^5 + 2X^4 + 3X^3 + 4X^2 + 5X + 6 = 0,$$


the function $y : \mathbb{R} \to \mathbb{R}$ defined by $y(t) := \alpha$ for all $t \in \mathbb{R}$ (i.e., the function with
constant value $\alpha$) is a solution of (3.1). This example should convince you that the
subject of differential equations already encompasses all of one- and two-variable 
polynomial equations. In particular, we shouldn’t get our hopes up that we will be 
able to solve too many higher-order differential equations in general. 
First order. We will give two examples of a first-order differential equation.
The first one takes full advantage of the implicit part of the definition: 


Example 3.1.2 (Clairaut). Consider the differential equation:
$$y - ty' + \exp y' = 0 \tag{3.2}$$
Every solution $y : \mathbb{R} \to \mathbb{R}$ of (3.2) has the form

$$y(t) = Ct - \exp C$$
where $C \in \mathbb{R}$ is some fixed constant. Note that even though (3.2) is complicated, it
is actually pretty easy to check that the given solution is correct. Indeed,
first compute the derivative of $y$:

$$y'(t) = C$$
and then plug $t, y(t), y'(t)$ into (3.2) and notice that everything cancels out:
$$y(t) - ty'(t) + \exp y'(t) = Ct - \exp C - tC + \exp C = 0.$$
This illustrates another important lesson: 
    Checking that a given function is/is not a solution to a
    differential equation is usually easy, even if the given
    differential equation is hard/impossible to solve.

Indeed, it is simply a matter of computing r derivatives and then plugging them 
into the equation and seeing if everything cancels out. Of course, we will be more 
interested in solving differential equations than checking whether a candidate solu- 
tion is correct or not. However, it is reassuring to know that at least one direction 
of the process is fairly easy. 
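
The "plug in and see if everything cancels" check is easy to automate. The sketch below evaluates the left-hand side of the Clairaut equation $y - ty' + \exp y' = 0$ for the candidate line $y(t) = Ct - \exp C$ at a few sample times. The function names and the sample value $C = 0.7$ are illustrative assumptions.

```python
import math


def clairaut_residual(y, dy, t):
    """Left-hand side of y - t*y' + exp(y') = 0 evaluated at time t."""
    return y(t) - t * dy(t) + math.exp(dy(t))


def line_candidate(C):
    """Return the candidate y(t) = C*t - exp(C) together with its derivative y'(t) = C."""
    return (lambda t: C * t - math.exp(C)), (lambda t: C)


y, dy = line_candidate(0.7)
# A genuine solution makes the residual vanish (up to rounding) at every t.
residuals = [clairaut_residual(y, dy, t) for t in (-2.0, 0.0, 1.5, 10.0)]
```

A residual that is visibly nonzero at even one point shows that a candidate is not a solution, while a residual that vanishes at a handful of sample points is supporting evidence rather than a proof.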

The next differential equation is a more typical example of a differential equation 
which we will study: 


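Example 3.1.3 (Logistic equation). Let $b, c > 0$ be fixed positive constants. Then
the logistic equation is the differential equation:
$$y' - y(b - cy) = 0$$
For every nonzero constant $C \in \mathbb{R} \setminus \{0\}$ we have a solution defined by:
$$y(t) = \frac{b}{c} \cdot \frac{1}{1 + C\exp(-bt)}$$
(this formula is defined on all of $\mathbb{R}$ when $C > 0$; when $C < 0$ the denominator vanishes at
one value of $t$, which must be excluded from the domain).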


Furthermore, the constant functions y = 0 and y = b/c are also solutions. (Exercise: 
check this!) We will study the logistic equation in more detail later, including how 
to derive these solutions. 
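
One way to approach the exercise numerically: the sketch below evaluates the residual $y' - y(b - cy)$ for the formula $y(t) = \frac{b}{c} \cdot \frac{1}{1 + C\exp(-bt)}$, approximating $y'$ by a central difference. The parameter values $b = 3$, $c = 1$, $C = 2$ and the helper names are assumptions chosen only for illustration, and a numerical check does not replace the computation by hand.

```python
import math


def logistic_solution(t, b, c, C):
    """Candidate solution y(t) = (b/c) / (1 + C*exp(-b*t))."""
    return (b / c) / (1.0 + C * math.exp(-b * t))


def logistic_residual(t, b, c, C, h=1e-5):
    """Approximate y'(t) - y(t)*(b - c*y(t)) using a central difference for y'."""
    y = lambda s: logistic_solution(s, b, c, C)
    dy = (y(t + h) - y(t - h)) / (2 * h)
    return dy - y(t) * (b - c * y(t))


residuals = [logistic_residual(t, b=3.0, c=1.0, C=2.0) for t in (-1.0, 0.0, 0.5, 2.0)]
```

The residuals come out at the level of the finite-difference error, which is consistent with the formula being a solution for these parameter values.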


Second order. Here is a typical example of a second-order differential equa- 
tion we will study: 


$$y'' - 3y' + 2y = 0 \tag{3.3}$$
Every solution $y : \mathbb{R} \to \mathbb{R}$ of (3.3) is of the form:
$$y(t) = C_1 \exp 2t + C_2 \exp t$$


where $C_1, C_2 \in \mathbb{R}$ are arbitrary constants. Generally speaking, for second-order
differential equations there will be two constants of integration we need to find.
This reflects the fact that the equation involves a first and second derivative (so
somewhere we are doing two integrals, each one with its own constant of integration).
Equation (3.3) is an example of a second-order linear differential equation
with constant coefficients, which will be one of the main equations of interest in
Chapter 4.


3.2. Differential equations in normal form 


Definition 3.1.1 casts a very wide net. In general most differential equations we will
encounter can be put into a slightly simpler form: normal form.


Definition 3.2.1. A differential equation of order $r$ in normal form (or an
explicit differential equation of order $r$) is a differential equation which can
be written in the form


$$y^{(r)} = F(t, y, y', y'', \ldots, y^{(r-1)}) \tag{$\dagger$}$$


where $F$ is a real-valued function of $r + 1$ variables. A solution of ($\dagger$) is a function
$y : I \to \mathbb{R}$ (where $I \subseteq \mathbb{R}$ is an interval) which is at least $r$ times differentiable, such
that for every $t \in I$:


$$y^{(r)}(t) = F(t, y(t), y'(t), \ldots, y^{(r-1)}(t)).$$
In other words, an implicit differential equation of order $r$ can be put into normal
form if it is possible to solve for the highest derivative $y^{(r)}$ in terms of the lower
derivatives $y, y', \ldots, y^{(r-1)}$ and $t$.


Example 3.2.2. (1) A zeroth-order differential equation in normal form is
an equation of the form:


$$y = F(t)$$
Clearly, the function $y(t) := F(t)$ is a solution. We will never be interested
in explicit zeroth-order differential equations.
(2) A first-order differential equation in normal form is an equation of the
form:


$$y' = F(t,y)$$
The logistic equation from Example 3.1.3 can be put into normal form:


$$y' = y(b - cy)$$
It is not clear whether the equation from Example 3.1.2
$$y - ty' + \exp y' = 0$$


can be put into normal form since this would involve solving for $y'$. In
general, for the equations we deal with there will be no issue with rewriting
them in normal form.
(3) A second-order differential equation in normal form is an equation of the
form:
$$y'' = F(t, y, y').$$
Equation (3.3) can be written in normal form:


$$y'' = 3y' - 2y$$


This concludes our discussion of general-order differential equations. For the rest 
of the chapter we will focus on first-order differential equations in normal form. 
Explicit first-order differential equations. Recall that an explicit first-order
differential equation is an equation which can be written in the form:


$$y' = F(t,y) \tag{3.4}$$
where $F$ is a real-valued function of two variables. A solution to (3.4) is a
differentiable function $y : I \to \mathbb{R}$ ($I \subseteq \mathbb{R}$ is an interval) such that for all $t \in I$,


$$y'(t) = F(t, y(t)).$$
Solutions are also referred to as integral curves or solution curves, especially
when we want to emphasize the geometric properties of the solution.


We will often be interested in obtaining a specific solution which passes through
a given point $(t_0, y_0)$. The best way to do this is to first find all solutions of the
differential equation, and then find the particular solution we are interested in.


Definition 3.2.3. The general solution of (3.4) is a family¹ of functions $y(t;C)$
which depends on a parameter $C \in \mathbb{R}$ such that:


(1) for every valid parameter $C_0$, the function $y(t;C_0)$ is a solution of (3.4),
and
(2) every solution of (3.4) is of the form $y(t;C_1)$ for some valid parameter $C_1$.


A particular solution is a function of the form $y(t) = y(t;C_0)$ for some fixed
value $C_0$.


Example 3.2.4. Consider the differential equation
$$y' = t \tag{3.5}$$
We wish to find the general solution to (3.5). Integrating both sides, we find that
$$y(t) = \tfrac{1}{2}t^2 + C$$
for some constant of integration $C \in \mathbb{R}$. We claim that the general solution is
$$y(t;C) = \tfrac{1}{2}t^2 + C$$


where $C$ can be any real number. Indeed, for every specific $C_0 \in \mathbb{R}$, the function
$y(t) = \tfrac{1}{2}t^2 + C_0$ is a solution. Furthermore, if $\tilde{y}(t)$ is also a solution, then $\tilde{y}'(t) = t$,
and thus
$$(\tilde{y}(t) - y(t;0))' = (\tilde{y}(t) - \tfrac{1}{2}t^2)' = t - t = 0$$
which shows that $\tilde{y}(t)$ and $y(t;0)$ differ by a constant. Thus there exists $C_1 \in \mathbb{R}$
such that $\tilde{y}(t) = y(t;C_1)$. We conclude that $y(t;C)$ is the general solution of (3.5).
Here are some particular solutions:


$$y(t) = y(t;3) = \tfrac{1}{2}t^2 + 3$$
$$y(t) = y(t;-10) = \tfrac{1}{2}t^2 - 10.$$


The problem of finding a specific particular solution will be formulated as an initial 
value problem: 


¹The notation $y(t;C)$ is meant to suggest that the function $y(t)$ depends also on the parameter
$C$. Each time you choose a specific value $C_0$ for $C$, you get a particular solution $y(t) := y(t;C_0)$.
Definition 3.2.5. An initial value problem is a pair of two conditions:
(i) a differential equation:
$$y' = F(t,y)$$
(ii) a specific point which the solution must pass through:


$$y(t_0) = y_0,$$
where $(t_0, y_0) \in \mathbb{R}^2$. This is called the initial condition.


Example 3.2.6. We wish to solve the following initial value problem:
(i) $y' = t$
(ii) $y(3) = 7$

We have already found that the general solution to (i) is


$$y(t;C) = \tfrac{1}{2}t^2 + C$$
We will use (ii) to solve for the exact value of $C$:
$$y(3) = 7 = \tfrac{1}{2} \cdot 3^2 + C$$


and so $C = 7 - 9/2 = 5/2$.
We conclude that the solution to the above initial value problem is:


$$y(t) = y(t;5/2) = \tfrac{1}{2}t^2 + \tfrac{5}{2}.$$
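
The step "use the initial condition to solve for $C$" can be sketched in a few lines. The code below uses exact rational arithmetic for the general solution $y(t;C) = \tfrac{1}{2}t^2 + C$ of $y' = t$. The function names are illustrative, and the approach relies on $C$ entering the general solution additively, which is an assumption that holds for this example but not for every differential equation.

```python
from fractions import Fraction


def general_solution(t, C):
    """General solution y(t; C) = t^2/2 + C of y' = t."""
    return Fraction(1, 2) * t**2 + C


def solve_for_constant(t0, y0):
    """Choose C so that y(t0; C) = y0, using that C enters additively."""
    return y0 - general_solution(t0, 0)


# Initial condition y(3) = 7 from the example.
C = solve_for_constant(3, 7)
```

Substituting the resulting $C$ back into `general_solution` at $t = 3$ reproduces the prescribed value $7$, which is a quick way to confirm that the constant was computed correctly.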


Direction fields. One of the remarkable features of explicit first-order differential
equations is that, even if some of them might be difficult to solve, it is usually
pretty easy to make a rough sketch of the general solution. This is because the
equation


$$y' = F(t,y)$$
tells us what the derivative of the solution needs to be at each point $(t,y)$ in the
plane. We make this precise with the notion of a direction field.
Definition 3.2.7. A direction field for the equation
$$y' = F(t,y)$$
is a plot where at each point $(t_0, y_0)$ you draw a tiny line segment with slope
$F(t_0, y_0)$.


Of course in practice when you (or a computer) draw a direction field, you can’t 
possibly draw such a line segment at every point in the plane (since there are 
infinitely many such points). Instead you draw enough tiny line segments (say, at 
integer or half-integer coordinates) in order to get a sense of the general behavior 
of the direction field. Once you have an accurate direction field, you can sketch an 
approximation of a solution by “following the direction of the direction field”. 
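
The two steps just described (tabulate slopes on a grid, then follow them) can be sketched as follows. The sketch uses the right-hand side $F(t,y) = y(3 - y)$ as a sample, half-integer grid points, and the simplest "follow the slope" rule, which is usually called Euler's method. The function names, grid, and step count are assumptions for illustration, and the result is only an approximation of a solution curve.

```python
def slope_grid(F, t_values, y_values):
    """Slopes F(t, y) of the direction field at each grid point (t, y)."""
    return {(t, y): F(t, y) for t in t_values for y in y_values}


def follow_field(F, t0, y0, t_end, steps=1000):
    """Sketch a solution by repeatedly stepping along the local slope (Euler's method)."""
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y += h * F(t, y)  # move a small distance in the direction the field points
        t += h
    return y


logistic = lambda t, y: y * (3 - y)          # sample right-hand side F(t, y)
half_integers = [k / 2 for k in range(-2, 9)]  # -1.0, -0.5, ..., 4.0
slopes = slope_grid(logistic, half_integers, half_integers)
y_at_5 = follow_field(logistic, 0.0, 0.5, 5.0)
```

The dictionary `slopes` holds the data one would draw as tiny line segments, and `follow_field` traces one approximate solution curve starting from the point $(0, 0.5)$.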


Example 3.2.8. Consider the logistic equation


$$y' = y(3 - y) \tag{3.6}$$
In Figure 3.1 we plot the direction field for (3.6). We also include four solution
curves corresponding to four different initial conditions.
FIGURE 3.1. Direction field for the logistic equation $y' = y(3 - y)$
and several solution curves.


We make the following observations: 
(1) At each point $(t_0, y_0)$, the slope only depends on $y_0$. This is because
$y(3 - y)$ only depends on $y$ and not on $t$.
(2) This suggests that if $y(t)$ is a solution to (3.6), then so is $y(t + C)$ for any
constant $C$.
(3) The direction field suggests that the constant functions


$$y(t) = 0 \quad \text{and} \quad y(t) = 3$$


are both solutions to (3.6). This is indeed the case, as can be easily
verified.

(4) There are many other non-constant solutions as well; we will learn how to
solve for them in a later section.


Of course, by merely plotting a direction field and sketching a solution curve, you are 
not actually solving the differential equation yet. However, this procedure provides 
valuable insight into the nature of the solutions which can be very fruitful. In some 
sense, this is the starting point for the qualitative study of differential equations. 


3.3. First-order linear differential equations 


We now arrive at the first family of differential equations which we will study in 
detail, the so-called first-order linear differential equations. 


Definition 3.3.1. A first-order linear differential equation is a differential
equation which can be written in the form:


$$y' + f(t)y = g(t)$$
where $f, g$ are real-valued functions of the variable $t$. The functions $f(t)$ and $g(t)$
are called² the coefficient functions.


As we shall see, solving a first-order linear differential equation really boils down 
to performing an integration. We will work up to the general case (where both f(t) 
and g(t) are nonzero functions) in several steps. 


Direct integration. Consider first the case where $f(t) = 0$ for all $t$. We call
the resulting differential equation:


$$y' = g(t)$$
a direct integration differential equation. This is because you can directly solve
this differential equation by integrating $g$ and, if need be, solving for $C$ with the
initial condition. Here is an example:


Example 3.3.2. Consider the initial value problem:


(i) $y' = \sqrt{t}$,
(ii) $y(4) = 6$.


Integrating the differential equation we obtain

$$y(t) = \tfrac{2}{3}t^{3/2} + C.$$
Using the initial condition we get

$$y(4) = 6 = \tfrac{2}{3}(4)^{3/2} + C$$

and so $C = 6 - 16/3 = 2/3$. So the solution to the above initial value problem is

$$y(t) = \tfrac{2}{3}t^{3/2} + \tfrac{2}{3}.$$
In Figure 3.2 we plot the corresponding solution curve together with the direction
field. Notice that the solution exists on the interval $[0, +\infty)$, and this is the largest possible
interval on which the solution can exist and remain a solution because $g(t) = \sqrt{t}$ is
only defined on $[0, +\infty)$.
We also remark that in Figure 3.2 we see that the direction field only depends on
$t$ and not on $y$. This observation allows us to guess (if we didn't know it already)
that any two solutions of (i) differ by a vertical translation (i.e., adding a constant).
This indeed is also the case for general direct integration differential equations.


²Sometimes just $f(t)$ is called the coefficient function and $g(t)$ is called the forcing function.

FIGURE 3.2. Direction field for the equation $y' = \sqrt{t}$ and the
solution curve passing through the point $(4,6)$.


Theorem 3.3.3 (Direct Integration). Suppose $g : D \to \mathbb{R}$ is a continuous function
with nice domain $D \subseteq \mathbb{R}$. Consider the differential equation:
(i) $y' = g(t)$


(1) The general solution of (i) is given by


$$y(t) = y(t;C) = \int g(t)\,dt + C$$


Furthermore, suppose we are also given an initial condition
(ii) $y(t_0) = y_0$, where $t_0 \in D$ and $y_0 \in \mathbb{R}$.
(2) Then the initial value problem (i)+(ii) has the unique solution:


$$y(t) = \int_{t_0}^t g(s)\,ds + y_0$$


(3) The interval of existence of this solution (i.e., the largest interval containing
$t_0$ for which this function remains a solution) is the largest interval
$I \subseteq \mathbb{R}$ such that:
(a) $t_0 \in I$, and
(b) $I \subseteq D$.
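
The formula $y(t) = \int_{t_0}^t g(s)\,ds + y_0$ translates directly into code once the integral is replaced by a numerical approximation. The sketch below does this with a midpoint rule and tries it on $g(t) = \sqrt{t}$ with the initial condition $y(4) = 6$. The helper names are hypothetical, and the caller is assumed to stay inside the interval of existence (here $t \geq 0$).

```python
import math


def midpoint_integral(g, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of g from a to b (b < a is allowed)."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def solve_direct_integration(g, t0, y0):
    """Return the function t -> y0 + integral of g from t0 to t."""
    return lambda t: y0 + midpoint_integral(g, t0, t)


# Initial value problem y' = sqrt(t), y(4) = 6.
y = solve_direct_integration(math.sqrt, 4.0, 6.0)
```

Evaluating `y(9.0)` gives a value close to $\tfrac{2}{3} \cdot 27 + \tfrac{2}{3} = 56/3$, in agreement with the closed-form solution $\tfrac{2}{3}t^{3/2} + \tfrac{2}{3}$.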


The homogeneous case. We next consider the case where g(t) is the constant 
zero function and f(t) is possibly nonzero. 


Definition 3.3.4. A first-order linear differential equation is said to be homogeneous
if it is of the form:


$$y' + f(t)y = 0.$$
Solving the homogeneous case requires knowing a trick: multiplication by a so-called
integrating factor. We illustrate this first with an example:
Example 3.3.5. Consider the homogeneous first-order linear differential equation:
$$y' + \frac{1}{t}y = 0 \tag{3.7}$$


Here we are regarding the coefficient function $1/t$ to have domain $(-\infty,0) \cup (0,+\infty)$.
First observe that if $\mu(t)$ is any function which is never zero, then the differential
equation


$$\mu(t)\left(y' + \frac{1}{t}y\right) = 0$$
has the same solutions as equation (3.7). We will use the following choice of $\mu(t)$:


$$\mu(t) := \exp\left(\int \frac{dt}{t}\right) = \exp \ln|t| = |t|$$


where the domain of $\mu(t)$ is also $(-\infty,0) \cup (0,+\infty)$. Then we multiply the lefthand
side of (3.7) by $\mu(t)$ to obtain:


$$|t|\left(y' + \frac{1}{t}y\right) = |t|y' + \operatorname{sgn}(t)y = (|t|y)' = 0.$$


In other words, multiplying through by the integrating factor $\mu(t)$ allows us to view
the lefthand side as the derivative of a single function of $t$. Next we integrate both
sides of


$$(|t|y)' = 0$$

to obtain
$$|t|y(t) = C,$$

or rather,
$$y(t) = \frac{C}{|t|}.$$


Here the function $y(t)$ also has domain $(-\infty,0) \cup (0,+\infty)$.
Here is how to handle the general homogeneous case:
Here is how to handle the general homogeneous case: 


Theorem 3.3.6. Suppose $f : D \to \mathbb{R}$ is a continuous function with nice domain
$D \subseteq \mathbb{R}$ and consider the differential equation:


(i) $y' + f(t)y = 0$
(1) Define the integrating factor to be the function $\mu : D \to \mathbb{R}$ given by:


$$\mu(t) := \exp\left(\int f(t)\,dt\right)$$


(here $\int f(t)\,dt$ can be any antiderivative of $f(t)$, the constant of integration
does not matter). Then we can multiply (i) by $\mu$ to obtain:


$$\mu(t)(y' + f(t)y) = (\mu(t)y)' = 0.$$
(2) The general solution of (i) is given by:


$$y(t) = y(t;C) = \frac{C}{\mu(t)} = C\exp\left(-\int f(t)\,dt\right)$$


Furthermore, suppose we are also given an initial condition
(ii) $y(t_0) = y_0$, where $t_0 \in D$ and $y_0 \in \mathbb{R}$.
(3) Then the initial value problem (i)+(ii) has the unique solution:

$$y(t) = y_0\exp\left(-\int_{t_0}^t f(s)\,ds\right) = \frac{y_0}{\mu(t)}$$


where $\mu(t) := \exp\left(\int_{t_0}^t f(s)\,ds\right)$.
(4) The interval of existence of this solution is the largest interval $I \subseteq \mathbb{R}$ such
that:
(a) $t_0 \in I$, and
(b) $I \subseteq D$.
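
As a numerical sketch of part (3), the code below evaluates $y(t) = y_0\exp\left(-\int_{t_0}^t f(s)\,ds\right)$ with a midpoint-rule integral, and tries it on $f(t) = 1/t$ with one initial condition on each side of $0$. The helper names and the two initial conditions are assumptions for illustration, and each solution should only be evaluated on the interval of the domain that contains its $t_0$.

```python
import math


def midpoint_integral(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f from a to b (b < a is allowed)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def solve_homogeneous_ivp(f, t0, y0):
    """Return t -> y0 * exp(-integral of f from t0 to t), i.e. y0 / mu(t)."""
    return lambda t: y0 * math.exp(-midpoint_integral(f, t0, t))


f = lambda t: 1.0 / t
y_right = solve_homogeneous_ivp(f, 1.0, 2.0)   # lives on (0, +infinity)
y_left = solve_homogeneous_ivp(f, -1.0, 2.0)   # lives on (-infinity, 0)
```

Both solutions agree with the closed form $C/|t|$ with $C = 2$: for instance `y_right(4.0)` and `y_left(-4.0)` are both close to $0.5$.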


The general case. The general first-order linear case contains both the direct 
integration case and the homogeneous case. The trick with the integrating factor 
also works for the general case. We give an example first: 


Example 3.3.7. Consider the first-order linear differential equation:
$$y' + \sin(t)y = \sin^3 t \tag{3.8}$$
The first thing to do is to compute the integrating factor:


$$\mu(t) = \exp\left(\int \sin t\,dt\right) = \exp(-\cos t)$$
Next we multiply both sides of (3.8) by $\mu(t)$ to obtain:


$$\mu(t)(y' + \sin(t)y) = (\exp(-\cos t)y)' = \sin^3 t\,\exp(-\cos t)$$
Integrating both sides yields:


$$\exp(-\cos t)y(t) = \int \sin^3 t\,\exp(-\cos t)\,dt = -4\exp(-\cos t)\cos^4(t/2) + C$$
Solving for $y(t)$ gives us the general solution:
$$y(t) = -4\cos^4(t/2) + C\exp(\cos t)$$
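
A quick way to double-check a general solution like this one is to plug it back into the equation. The sketch below does so for $y(t) = -4\cos^4(t/2) + C\exp(\cos t)$ and the equation $y' + \sin(t)y = \sin^3 t$, with the derivative computed by hand via the chain rule. The function names and the sample values of $t$ and $C$ are illustrative assumptions.

```python
import math


def y(t, C):
    """Candidate general solution y(t) = -4*cos(t/2)^4 + C*exp(cos t)."""
    return -4.0 * math.cos(t / 2) ** 4 + C * math.exp(math.cos(t))


def dy(t, C):
    """Derivative of y: 8*cos(t/2)^3*sin(t/2) - C*sin(t)*exp(cos t)."""
    return (8.0 * math.cos(t / 2) ** 3 * math.sin(t / 2)
            - C * math.sin(t) * math.exp(math.cos(t)))


def residual(t, C):
    """y' + sin(t)*y - sin(t)^3, which should vanish for a solution."""
    return dy(t, C) + math.sin(t) * y(t, C) - math.sin(t) ** 3


residuals = [residual(t, C) for t in (-3.0, 0.4, 1.0, 7.5) for C in (-2.0, 0.0, 5.0)]
```

The residual vanishes up to rounding for every sampled $t$ and $C$, which is what one expects from a family of solutions parametrized by $C$.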
The general case works much the same way: 


Theorem 3.3.8. Suppose $f : D \to \mathbb{R}$ and $g : E \to \mathbb{R}$ are continuous functions with
nice domains $D, E \subseteq \mathbb{R}$ and consider the differential equation


(i) $y' + f(t)y = g(t)$
(1) Define the integrating factor to be the function $\mu : D \to \mathbb{R}$ given by:


$$\mu(t) := \exp\left(\int f(t)\,dt\right)$$


(here $\int f(t)\,dt$ can be any antiderivative of $f(t)$, the constant of integration
does not matter). Then we can multiply (i) by $\mu$ to obtain:


$$\mu(t)(y' + f(t)y) = (\mu(t)y)' = \mu(t)g(t).$$
(2) The general solution of (i) is then given by:


$$y(t) = y(t;C) = \frac{1}{\mu(t)}\int \mu(t)g(t)\,dt + \frac{C}{\mu(t)}$$
Furthermore, suppose we are also given an initial condition


(ii) $y(t_0) = y_0$, where $t_0 \in D \cap E$ and $y_0 \in \mathbb{R}$.
(3) Then the initial value problem (i)+(ii) has the unique solution:


$$y(t) = \frac{1}{\mu(t)}\int_{t_0}^t \mu(s)g(s)\,ds + \frac{y_0}{\mu(t)}$$


where $\mu(t) := \exp\left(\int_{t_0}^t f(s)\,ds\right)$.
(4) The interval of existence of this solution is the largest interval $I \subseteq \mathbb{R}$ such
that:
(a) $t_0 \in I$,
(b) $I \subseteq D$, and
(c) $I \subseteq E$.
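
The solution formula in part (3) can be turned into a small numerical routine by replacing both integrals with midpoint-rule approximations. The sketch below is one such translation, not a production solver: the helper names are hypothetical, the accuracy is limited by the crude quadrature, and the sample problem $y' + y = 1$, $y(0) = 0$ (with exact solution $1 - \exp(-t)$) is an assumption chosen because it is easy to check by hand.

```python
import math


def midpoint_integral(f, a, b, n=500):
    """Midpoint-rule approximation of the integral of f from a to b (b < a is allowed)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def solve_linear_ivp(f, g, t0, y0):
    """Return y(t) = (1/mu(t)) * int_{t0}^t mu(s) g(s) ds + y0/mu(t)."""
    def mu(t):
        # Integrating factor normalised so that mu(t0) = 1.
        return math.exp(midpoint_integral(f, t0, t))

    def y(t):
        weighted = midpoint_integral(lambda s: mu(s) * g(s), t0, t)
        return (weighted + y0) / mu(t)

    return y


# Sample problem (an assumption for illustration): y' + y = 1, y(0) = 0.
y = solve_linear_ivp(lambda t: 1.0, lambda t: 1.0, 0.0, 0.0)
```

Because the integrating factor is normalised so that $\mu(t_0) = 1$, the returned function satisfies the initial condition exactly, and `y(2.0)` is close to $1 - \exp(-2)$. As with the theorem, the function should only be evaluated on the interval of existence.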


PROOF. (1) First we will justify the key property of the integrating factor:


$$\mu(t)(y' + f(t)y) = (\mu(t)y)'$$


Note that:


$$\begin{aligned}
(\mu(t)y)' &= \mu(t)y' + \mu'(t)y && \text{by the product rule 2.3.4(2)} \\
&= \mu(t)y' + \frac{d}{dt}\left[\exp\left(\int f(t)\,dt\right)\right]y \\
&= \mu(t)y' + \exp\left(\int f(t)\,dt\right)\frac{d}{dt}\left[\int f(t)\,dt\right]y && \text{by the Chain Rule 2.3.6} \\
&= \mu(t)y' + \mu(t)f(t)y && \text{by Theorem 2.4.10} \\
&= \mu(t)(y' + f(t)y)
\end{aligned}$$


(2 part 1) Next, we will check that for every $C \in \mathbb{R}$, the function $y(t;C)$ is
a solution. Since $\mu(t)$ is a function which is everywhere nonzero, it follows that
$y(t;C)$ is a solution of


$$y' + f(t)y = g(t)$$
if and only if $y(t;C)$ is a solution of
$$\mu(t)(y' + f(t)y) = \mu(t)g(t). \tag{$\dagger$}$$
We will verify that $y(t;C)$ is indeed a solution of ($\dagger$). Note that:


$$\begin{aligned}
\mu(t)(y'(t;C) + f(t)y(t;C)) &= (\mu(t)y(t;C))' && \text{by (1)} \\
&= \left(\int \mu(t)g(t)\,dt + C\right)' \\
&= \mu(t)g(t) && \text{by Theorem 2.4.10}
\end{aligned}$$


This verifies part (1) of Definition 3.2.3. We will return to verifying part (2) of the
definition later.
(3 part 1) We now verify that


$$y(t) = \frac{1}{\mu(t)}\int_{t_0}^t \mu(s)g(s)\,ds + \frac{y_0}{\mu(t)}$$


is a solution to the initial value problem (i)+(ii). It is clear that $y(t)$ is a solution
to (i) since it is a particular instance of the general solution in (2). To verify (ii),
we notice first that:


$$\begin{aligned}
\mu(t_0) &= \exp\left(\int_{t_0}^{t_0} f(s)\,ds\right) \\
&= \exp(0) && \text{by Definition 2.4.1} \\
&= 1.
\end{aligned}$$


Next, we observe:


$$\begin{aligned}
y(t_0) &= \frac{1}{\mu(t_0)}\int_{t_0}^{t_0} \mu(s)g(s)\,ds + \frac{y_0}{\mu(t_0)} \\
&= \int_{t_0}^{t_0} \mu(s)g(s)\,ds + y_0 \\
&= 0 + y_0 && \text{by Definition 2.4.1} \\
&= y_0.
\end{aligned}$$


Thus $y(t)$ is a solution to the initial value problem (i)+(ii). We will prove uniqueness
below.

(4) First observe that the interval $I \subseteq D \cap E$ described in (4) is the largest possible interval which
contains $t_0$ which we could hope to have as the domain of the solution. This is
because the differential equation (i) is only defined on the set $D \cap E$ (the set on which
both coefficient functions $f$ and $g$ are defined).

(2 part 2) and (3 part 2) are taken care of by the next lemma.


Lemma 3.3.9. Suppose $f : D \to \mathbb{R}$ and $g : E \to \mathbb{R}$ are continuous functions
with nice domains $D, E \subseteq \mathbb{R}$. Suppose that $y_0, y_1 : I \to \mathbb{R}$ are two differentiable
functions such that:

(a) $I \subseteq \mathbb{R}$ is an interval contained in both $D$ and $E$,

(b) for $i = 0, 1$, $y_i'(t) + f(t)y_i(t) = g(t)$ for every $t \in I$, i.e., $y_0$ and $y_1$ are both
solutions to the differential equation:


$$y' + f(t)y = g(t)$$


Then:
(1) there exists a constant $C \in \mathbb{R}$ such that for every $t \in I$,
$$y_0(t) = y_1(t) + \frac{C}{\mu(t)}$$


where $\mu(t) = \exp\left(\int f(t)\,dt\right)$,
(2) in particular, if there is $t_0 \in I$ such that $y_0(t_0) = y_1(t_0)$, then $C = 0$ and
so for every $t \in I$, $y_0(t) = y_1(t)$.
PROOF. It follows from (b) that for every $t \in I$,
$$(y_0 - y_1)'(t) + f(t)(y_0 - y_1)(t) = 0.$$
Multiplying both sides by $\mu(t)$ yields for every $t \in I$:
$$\mu(t)\bigl((y_0 - y_1)'(t) + f(t)(y_0 - y_1)(t)\bigr) = 0$$


which we can rewrite as:
$$\bigl(\mu(t)(y_0 - y_1)(t)\bigr)' = 0$$
for every $t \in I$. Since $I$ is an interval, by the corollary stating that a function with zero
derivative on an interval is constant, there is a constant $C \in \mathbb{R}$
such that for every $t \in I$:


$$\mu(t)(y_0 - y_1)(t) = C.$$

Thus for every $t \in I$,
$$y_0(t) = y_1(t) + \frac{C}{\mu(t)}.$$


This establishes (1). For (2), suppose there is $t_0 \in I$ such that $y_0(t_0) = y_1(t_0)$.
Plugging in $t_0$ into the above equation then yields:


$$y_0(t_0) = y_1(t_0) + \frac{C}{\mu(t_0)}$$


which simplifies to
$$0 = \frac{C}{\mu(t_0)}.$$
This gives us $C = 0$. In particular, for every $t \in I$, we have


$$y_0(t) = y_1(t).$$


This establishes (2).


Remark about absolute values in the integrating factor. In this subsection
we make a few remarks about the role of absolute values in the integrating
factor $\mu(t)$ which appears when computing a solution of a first-order linear differential
equation. We begin with a soft rule-of-thumb:


Rule of Thumb 3.3.10. If there are absolute values which arise in


$$\mu(t) = \exp\left(\int f(t)\,dt\right)$$


as a result of an expression $\ln|\cdots|$ arising in $\int f(t)\,dt$, then these absolute values
can be safely removed in the final expression for $\mu(t)$.


TLDR EXPLANATION. Suppose we are looking at the first-order linear differential
equation:

$$y' + f(t)y = g(t)$$
The only relevant property that we need an integrating factor $\mu(t)$ to satisfy is that
it simplifies the lefthand side:


$$\mu(t)(y' + f(t)y) = (\mu(t)y)' \tag{$\dagger$}$$
However, if $\mu(t)$ satisfies ($\dagger$), then so does $-\mu(t)$:
$$-\mu(t)(y' + f(t)y) = (-\mu(t)y)'$$


since this amounts to multiplying ($\dagger$) through by $-1$. Now suppose that $\mu(t) = |u(t)|$
for some differentiable function $u(t)$. Then by definition,


$$\mu(t) = \begin{cases} u(t) & \text{if } u(t) > 0 \\ -u(t) & \text{if } u(t) < 0 \end{cases}$$


The claim is that the function $u(t)$ (i.e., $\mu$ without the absolute values) can serve
as an integrating factor. This is essentially because:


$$u(t) = \begin{cases} \mu(t) & \text{if } u(t) > 0 \\ -\mu(t) & \text{if } u(t) < 0 \end{cases}$$
Since both $\mu(t)$ and $-\mu(t)$ work perfectly well as integrating factors, it follows that
in all cases, the function $u(t)$ works as an integrating factor.


We hesitate to call Rule of Thumb 3.3.10 a “Fact” or “Theorem” because this would require a
complete investigation into all possible ways that an absolute value could show up 
in a formula for an antiderivative of an elementary function. However, we will give 
a justification as to why dropping absolute value signs is allowed and what we are 
actually doing to the integrating factor when we do drop the absolute value signs. 
For this discussion, we first make more precise what we mean by an integrating 
factor: 


Definition 3.3.11. Suppose $f : D \to \mathbb{R}$ is a continuous function with a nice domain
$D \subseteq \mathbb{R}$ and $I \subseteq D$ is a nice subset of $D$. We call a differentiable function $\mu : I \to \mathbb{R}$
an integrating factor for $y' + fy$ on $I$ if:

(1) $\mu(t) \neq 0$ for every $t \in I$, and

(2) for every differentiable function $y : I \to \mathbb{R}$, the following equality holds:


$$\mu(t)(y'(t) + f(t)y(t)) = (\mu(t)y(t))'$$


for every $t \in I$.


Certainly, the integrating factors we've been using:


$$\mu(t) = \exp\left(\int f(t)\,dt\right)$$


satisfy the definition of an integrating factor according to Definition 3.3.11. But an
integrating factor is by no means unique. Indeed, we are free to multiply an integrating
factor by any nonzero constant and it remains a perfectly valid integrating
factor:


Observation 3.3.12. Suppose $f : D \to \mathbb{R}$ is a continuous function with a nice
domain $D \subseteq \mathbb{R}$, $I \subseteq D$ is a nice subset of $D$, and $\mu : I \to \mathbb{R}$ is an integrating
factor for $y' + fy$ on $I$. Then for any nonzero constant $\alpha \in \mathbb{R}$ ($\alpha \neq 0$), the function
$\alpha\mu : I \to \mathbb{R}$ is also an integrating factor for $y' + fy$ on $I$.


However, we have a little bit more freedom in modifying our integrating factors than
just multiplying everything through by nonzero constants. For instance, consider
the differential equation:


$$y' + \frac{1}{t}y = 0$$


We find that an integrating factor is $\mu(t) = \exp\left(\int dt/t\right) = |t|$. However, Rule of Thumb 3.3.10
claims that we can switch to using $\tilde{\mu}(t) = t$ as an integrating factor. The modification
from $\mu(t)$ to $\tilde{\mu}(t)$ is more involved than just scaling $\mu(t)$ by a nonzero constant.
First, note that in this example, $f(t) = 1/t$ and so $f : (-\infty,0) \cup (0,+\infty) \to \mathbb{R}$ does
not have 0 in its domain, so we are also considering $\mu(t) = |t|$ to be a function
$\mu : (-\infty,0) \cup (0,+\infty) \to \mathbb{R}$ without zero in its domain. Furthermore, note that:


$$\tilde{\mu}(t) = t = \begin{cases} |t| & \text{if } t > 0 \\ -|t| & \text{if } t < 0 \end{cases} = \begin{cases} \mu(t) & \text{if } t > 0 \\ -\mu(t) & \text{if } t < 0 \end{cases}$$


In other words, to change $\mu(t)$ into $\tilde{\mu}(t)$, we had to multiply $\mu(t)$ by $-1$ on the
$(-\infty,0)$ portion of its domain, and keep $\mu(t)$ the same on the $(0,+\infty)$ portion of
its domain. The reason this type of “selective” multiplication of $\mu(t)$ is allowed is
because $(-\infty,0)$ and $(0,+\infty)$ are not connected to each other, so we don't have to
worry about the portion of $\tilde{\mu}$ on $(-\infty,0)$ joining up nicely with the portion of $\tilde{\mu}$ on
$(0,+\infty)$. This is an instance of the following general observation:


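Observation 3.3.13. Suppose $f : D \to \mathbb{R}$ is a continuous function with a nice
domain $D \subseteq \mathbb{R}$, and suppose $\mu : D \to \mathbb{R}$ is an integrating factor for $y' + fy$ on $D$.
Furthermore:

(1) Suppose the domain $D = I_1 \cup I_2 \cup I_3 \cup \cdots$ is a union of disconnected
intervals $I_k$ (i.e., there is no $i \neq j$ and $a < b \in \mathbb{R}$ with $a \in I_i$, $b \in I_j$ such that $[a,b] \subseteq I_i \cup I_j$),
and

(2) Suppose $\alpha_1, \alpha_2, \alpha_3, \ldots$ is a sequence of nonzero constants from $\mathbb{R}$.


Then the function $\tilde{\mu} : D \to \mathbb{R}$ defined by:

$$\tilde{\mu}(t) := \alpha_k\mu(t) \quad \text{if } t \in I_k$$
is also an integrating factor for $y' + fy$ on $D$.
We now arrive at a more precise version of Rule of Thumb 3.3.10: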


Observation 3.3.14. Suppose $f : D \to \mathbb{R}$ is a continuous function with a nice
domain $D \subseteq \mathbb{R}$, and suppose


$$\mu(t) := \exp\left(\int f(t)\,dt\right) = |u(t)| \quad \text{for every } t \in D$$


where $u : D \to \mathbb{R}$ is some differentiable function. Then:
(1) for every $t \in D$, $u(t) \neq 0$,
(2) the sets,


$$D_1 := \{t \in D : u(t) > 0\} \quad \text{and} \quad D_2 := \{t \in D : u(t) < 0\}$$


are disconnected and $D = D_1 \cup D_2$, and thus
(3) the function $\tilde{\mu} : D \to \mathbb{R}$ defined by


$$\tilde{\mu}(t) := u(t)$$
for every $t \in D$ is also an integrating factor of $y' + fy$.


JUSTIFICATION. (1) is clear because $\mu(t)$ is defined as an exponential of a certain
function, and $\exp$ never takes the value zero.

(2) Suppose towards a contradiction that there is an interval $[a,b] \subseteq D$ such
that $a \in D_1$ and $b \in D_2$ (the other case is similar). Then since $u : [a,b] \to \mathbb{R}$
is differentiable, and hence continuous, by the Intermediate Value Theorem 2.2.6
there is $c \in (a,b)$ such that $u(c) = 0$. This contradicts (1). Thus $D_1$ and $D_2$ are
disconnected. The claim that $D = D_1 \cup D_2$ also follows from (1).

(3) is an application of Observation 3.3.13. In order to obtain $\tilde{\mu}$ from $\mu$, on
every interval $I \subseteq D_1$, we can keep $\mu$ the same, and on every interval $I \subseteq D_2$, we
can multiply $\mu$ by $-1$.


Remark 3.3.15. In general, you only need to worry about absolute value signs
(and whether to drop them) when computing the general solution of a first-order
linear differential equation. For an initial value problem, you use the precise
integrating factor:
$$\mu(t) = \exp\left(\int_{t_0}^t f(s)\,ds\right)$$
where $t_0, t$ are both included in the same interval in the domain of $f$. Since your
attention is restricted to this interval, the context should tell you, when faced with
$|u(t)|$, whether to treat this as $u(t)$ or $-u(t)$ (depending on whether $u(t_0) > 0$ or
$u(t_0) < 0$); only one of them can happen on an interval in the domain of $f$ which
contains $t_0$.


We now give a very carefully worked out example, where we show how to apply the 
above discussion on absolute values. In general, when you are doing computations, 
you are free to drop absolute values in this context without justification provided 
that you still get the full correct answer. 


Example 3.3.16. Consider the following initial value problem: 

(i) $y' + \tan(t)\,y = \sec(t)$

(ii) $y(0) = 5$.
Find the general solution to (i) and the particular solution to (i)+(ii).
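SOLUTION. First notice that the domain of $f(t) = \tan(t)$ and $g(t) = \sec(t)$ is

$$D := \operatorname{domain}(\tan t) = \operatorname{domain}(\sec t) = \bigcup_{k \in \mathbb{Z}} \left(\frac{\pi}{2} + \pi k,\ \frac{\pi}{2} + \pi(k+1)\right)$$

i.e., the domain is all of $\mathbb{R}$ except points of the form $\pi/2 + \pi k$, where $k \in \mathbb{Z}$. Next
we compute the usual integrating factor:

$$\mu(t) := \exp\left(\int \tan t\,dt\right) = \exp\ln|\sec t| = |\sec t|.$$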


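The domain of $\mu(t)$ is the same as the domain of $\tan t$ and $\sec t$ above ($= D$).
Furthermore, note that

$$D_1 := \{t \in D : \sec t > 0\} = \bigcup_{k \in \mathbb{Z},\ k \text{ odd}} \left(\frac{\pi}{2} + \pi k,\ \frac{\pi}{2} + \pi(k+1)\right)$$
$$D_2 := \{t \in D : \sec t < 0\} = \bigcup_{k \in \mathbb{Z},\ k \text{ even}} \left(\frac{\pi}{2} + \pi k,\ \frac{\pi}{2} + \pi(k+1)\right)$$

As we see, the intervals in $D_1$ are not connected to the intervals in $D_2$. Thus we
can define $\tilde{\mu} : D \to \mathbb{R}$ by

$$\tilde{\mu}(t) := \begin{cases} \mu(t) & \text{if } t \in D_1 \\ -\mu(t) & \text{if } t \in D_2 \end{cases} \;=\; \sec t$$

for every $t \in D$. By Observation 3.3.13 we know that $\tilde{\mu}(t) = \sec t$ also works as an
integrating factor, so we will use that instead. Continuing on with the problem, we
multiply (i) through by $\tilde{\mu}$ to obtain: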


$$(\sec(t)\,y)' = \sec^2 t$$
Integrating both sides yields:

$$\sec(t)\,y = \tan t + C$$
where $C \in \mathbb{R}$ is an arbitrary constant. Thus the general solution³ is:

$$y(t) = \frac{\tan t + C}{\sec t}$$
on the domain $D$.
Next, we will solve the initial value problem (i)+(ii) from scratch. Since $t_0 = 0$,
we see that the interval of existence of the solution will be $(-\pi/2, \pi/2)$, so we can
restrict our attention to this interval. First we compute the integrating factor
(where $t \in (-\pi/2, \pi/2)$):

$$\begin{aligned}
\mu(t) &= \exp\left(\int_0^t \tan s\,ds\right)\\
&= \exp\left(\Big[\ln|\sec s|\Big]_{s=0}^{s=t}\right)\\
&\overset{(*)}{=} \exp\left(\Big[\ln\sec s\Big]_{s=0}^{s=t}\right)\\
&= \exp(\ln\sec t - \ln\sec 0)\\
&= \exp(\ln\sec t - \ln 1)\\
&= \exp(\ln\sec t)\\
&= \sec t
\end{aligned}$$

where in step (*) we removed the absolute value signs because $\sec s$ is positive at
$s = 0$ (if the initial condition had $t_0 = \pi$ for instance, then we would have to replace
$\ln|\sec s|$ with $\ln(-\sec s)$ in that step). Now that we have the integrating factor,
we can proceed with the particular solution (which is only defined on the interval
of existence $(-\pi/2, \pi/2)$):

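$$\begin{aligned}
y(t) &= \frac{1}{\sec t}\int_0^t \sec^2 s\,ds + \frac{5}{\sec t} \qquad \text{because } y_0 = 5\\
&= \frac{\tan t}{\sec t} + \frac{5}{\sec t}\\
&= \frac{\tan t + 5}{\sec t}.
\end{aligned}$$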
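
As a sanity check (not part of the textbook method), the particular solution $y(t) = (\tan t + 5)/\sec t$ can be tested numerically. The sketch below uses only the Python standard library; the helper names `y` and `residual`, the step size `h`, and the sample points are illustrative choices. It approximates $y'$ by a central difference and confirms that $y' + \tan(t)\,y - \sec(t)$ is close to zero on $(-\pi/2, \pi/2)$ and that $y(0) = 5$.

```python
import math

def y(t):
    # Candidate particular solution of y' + tan(t) y = sec(t), y(0) = 5.
    # (tan t + 5) / sec t is the same as (tan t + 5) * cos t.
    return (math.tan(t) + 5.0) * math.cos(t)

def residual(t, h=1e-6):
    # Central-difference approximation of y'(t), plugged into the equation.
    dy = (y(t + h) - y(t - h)) / (2 * h)
    return dy + math.tan(t) * y(t) - 1.0 / math.cos(t)

# Sample points inside the interval of existence (-pi/2, pi/2).
sample_points = [-1.2, -0.5, 0.0, 0.7, 1.3]
max_residual = max(abs(residual(t)) for t in sample_points)
initial_value = y(0.0)

print("largest residual:", max_residual)
print("y(0) =", initial_value)
```

A residual near zero at every sample point is consistent with the formula, although a numerical check of this kind is of course not a proof.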


Mixing problems. We now discuss a practical application of first-order linear 
differential equations, the so-called mixing problems. We will introduce mixing 
problems with an example from [1] and an example from [2]. All mixing problems
basically follow the same general outline, although the differential equations which 
show up might vary. 


Example 3.3.17 (Constant volume example). Suppose a tank contains 10L of 
brine solution (salt dissolved in water). Assume the initial concentration of salt 
is 100g/L. Another brine solution flows into the tank at a rate of 3L/min with a 
concentration of 400g/L. Suppose the mixture is well stirred and flows out of the 
tank at a rate of 3L/min. Let $y(t)$ denote the amount of salt in the tank at time $t$.
Find y(t). 

³ Technically speaking, the general solution would have a possibly different constant $+C$ on
each connected component $(\pi/2 + k\pi, \pi/2 + (k+1)\pi)$ of the domain, however we are sweeping
this point under the rug. See Remark

SOLUTION. We are interested in solving for
$$y(t) = \text{amount of salt}, \quad \text{units: g}.$$

We will determine the function $y(t)$ by setting up and solving a differential equation
for $y'(t)$:
$$y'(t) = \text{rate of change in amount of salt}, \quad \text{units: g/min}$$

The main equation we will use is the so-called balance law:
$$y'(t) = \text{rate in} - \text{rate out}$$

Note that $y'(t)$, the “rate in” and “rate out” all have units g/min, whereas the
information given in the question has units of either g/L or L/min. Thus we will
need to use the following dimensional analysis:

$$\frac{\text{amount of salt}}{\text{unit of time}} = \frac{\text{volume of brine}}{\text{unit of time}} \times \frac{\text{amount of salt}}{\text{volume of brine}}.$$


We now will determine the “rate in” and “rate out”: 
Rate in: The brine flows in at a rate of 3L/min with a fixed concentration of 
400g/L. Thus the rate in of salt is: 


rate in = 3L/min x 400g/L = 1200g/min. 


Rate out: The brine flows out at a rate of 3L/min. The concentration of the 
brine in the tank changes, however, depending on the value of y(t). Since the tank 
contains a constant volume of brine, the concentration at time $t$ in the tank is

$$\text{concentration in tank} = \frac{y(t)}{10}\ \text{g/L}$$
and thus the rate out is:
$$\text{rate out} = 3\,\text{L/min} \times \frac{y(t)}{10}\ \text{g/L} = \frac{3y(t)}{10}\ \text{g/min}.$$


IVP: We conclude that the differential equation that $y$ satisfies is:
$$y'(t) = 1200 - \frac{3}{10}y(t)$$
which we recognize as a first-order linear differential equation:
$$y' + \frac{3}{10}y = 1200.$$

Furthermore, at time $t = 0$, we know that $y(0) = 100\,\text{g/L} \times 10\,\text{L} = 1000\,\text{g}$. To
summarize, we need to solve the IVP:

(i) $y' + \frac{3}{10}y = 1200$,

(ii) $y(0) = 1000$.
Using the usual method, we find that the solution is:

$$y(t) = 4000 - 3000e^{-3t/10}$$

where $y(t)$ is measured in grams (g).
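
The closed form for the constant-volume tank can be cross-checked against a direct numerical integration of the balance law $y' = 1200 - \frac{3}{10}y$. The sketch below is an illustration only: it uses a classical fourth-order Runge-Kutta step (a standard numerical method, not something covered in these notes), and the function names `rhs`, `rk4`, and `closed_form`, the end time of 5 minutes, and the step count are arbitrary choices.

```python
import math

def closed_form(t):
    # Solution found above: y(t) = 4000 - 3000 e^{-3t/10}, in grams.
    return 4000.0 - 3000.0 * math.exp(-3.0 * t / 10.0)

def rhs(t, y):
    # Balance law: rate in (1200 g/min) minus rate out (3 L/min * y/10 g/L).
    return 1200.0 - 3.0 * y / 10.0

def rk4(f, t0, y0, t_end, steps):
    # Classical Runge-Kutta integration of y' = f(t, y) from t0 to t_end.
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

numeric = rk4(rhs, 0.0, 1000.0, 5.0, 1000)  # start from y(0) = 1000 g
exact = closed_form(5.0)
print("numeric:", numeric, "closed form:", exact)
```

Note also that $y(t) \to 4000$ g as $t \to \infty$, which matches the inflow concentration of 400 g/L in a 10 L tank.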


Here is a similar example, except that in this example, the volume of solution in
the tank changes, as a result of an imbalance between the rate in and rate out:
Example 3.3.18 (Nonconstant volume example). Suppose a 600L tank is filled
with 300L of pure water at time $t = 0$. A spigot is opened above the tank and a
brine solution with concentration 1.5g/L begins flowing into the tank at a rate of
3L/min. Simultaneously, a drain is opened at the bottom of the tank allowing the
solution to leave the tank at a rate of 1L/min. What will be the salt content in the
tank at the precise moment that the volume of solution in the tank is equal to the
tank’s capacity (=600L)?


SOLUTION. We need to perform a similar analysis as in Example 3.3.17 to get the
function $y(t)$, but we also need to know at what time $t_{\text{full}}$ the volume of solution
in the tank is equal to 600L. Let $V(t)$ be the volume in the tank (in units of L). Then
the change in volume is also governed by a balance law:
$$V'(t) = \text{rate in} - \text{rate out} = 3\,\text{L/min} - 1\,\text{L/min} = 2\,\text{L/min}$$
and thus
$$V(t) = 2t + C.$$

Since $V(0) = 300$, we get that $C = 300$ and so $V(t) = 2t + 300$. This allows us to
determine the time $t_{\text{full}}$ at which the tank is full:

$$600 = V(t_{\text{full}}) = 2t_{\text{full}} + 300$$

and thus $t_{\text{full}} = 150$ min (so the tank will be full at the 2.5-hour mark).
Next we determine the function $y(t)$, again using the balance law:
$$y'(t) = \text{rate in} - \text{rate out}$$


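Rate in: We are given that the solution which flows in has a rate of 3L/min,
and a constant concentration of 1.5g/L. Thus:

$$\text{rate in} = 3\,\frac{\text{L}}{\text{min}} \times 1.5\,\frac{\text{g}}{\text{L}} = 4.5\,\frac{\text{g}}{\text{min}}.$$

So the rate in of salt is constant.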

Rate out: We are given that the solution flows out at a constant rate of 
1L/min. The concentration in the tank, however, depends on the amount of salt in 
the tank y(t), as well as the volume of solution in the tank V(t). Thus: 


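$$\text{rate out} = \text{volume rate out} \times \text{concentration in tank} = 1\,\frac{\text{L}}{\text{min}} \times \frac{y(t)}{V(t)}\,\frac{\text{g}}{\text{L}} = \frac{y(t)}{2t+300}\,\frac{\text{g}}{\text{min}}.$$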
Thus our differential equation for $y(t)$ is:

$$y' = 4.5 - \frac{y}{2t + 300}$$

and our initial value is $y(0) = 0$ (since the tank starts with pure water, with no
salt). This is a first-order linear differential equation. The solution is:

$$y(t) = 450 + 3t - \frac{4500\sqrt{3}}{\sqrt{300 + 2t}}$$
And thus the salt content at $t_{\text{full}} = 150$ is:
$$y(150) = 450 + 3 \cdot 150 - \frac{4500\sqrt{3}}{\sqrt{300 + 2 \cdot 150}} \approx 582\,\text{g}.$$
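
The solution of the nonconstant-volume problem is stated without its intermediate steps. One way to obtain it with the integrating factor method of this section is sketched below; the constant $C$ is introduced here only for the derivation.

```latex
\begin{align*}
y' + \frac{1}{2t+300}\,y &= 4.5,
  \qquad \mu(t) = \exp\left(\int \frac{dt}{2t+300}\right) = \sqrt{2t+300},\\
\left(\sqrt{2t+300}\;y\right)' &= 4.5\,\sqrt{2t+300},\\
\sqrt{2t+300}\;y &= 1.5\,(2t+300)^{3/2} + C,\\
y(t) &= 450 + 3t + \frac{C}{\sqrt{2t+300}}.
\end{align*}
% The initial condition y(0) = 0 gives 450 + C/\sqrt{300} = 0,
% i.e. C = -450\sqrt{300} = -4500\sqrt{3}.
```

At $t = 150$ the last term equals $-4500\sqrt{3}/\sqrt{600} = -450/\sqrt{2} \approx -318.2$, so $y(150) \approx 900 - 318.2 = 581.8$ g.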
Variation of parameters. In this subsection we summarize an alternative
method of solving a first-order linear differential equation, the method of variation
of parameters. From a raw computational standpoint, this method requires you to
compute the same integrals you otherwise would compute using the usual method,
and for this reason we will not spend much time on it. However, it illustrates a
certain idea in solving differential equations which we will encounter again:


A solution to the homogeneous equation can be used


to find a solution to the inhomogeneous equation.


We will illustrate the method of variation of parameters first through an example,
and then give some general statements. We will only look at finding the general
solution; a particular solution to an IVP is found from the general solution in the
usual way (using the initial condition to solve for $C$).


Example 3.3.19. Find the general solution to the following differential equation:
(3.9) $y' + y = \exp(t)$
SOLUTION. We will solve this using variation of parameters in multiple steps:
Step 1: Get the general solution to the homogeneous equation:
$$y' + y = 0.$$

For this we do the same thing as before, first compute the integrating factor:

$$\mu(t) = \exp\left(\int dt\right) = \exp(t)$$

Now we multiply the differential equation through by $\mu(t)$ to obtain:

$$(\exp(t)\,y)' = 0$$
and then integrate to get the homogeneous solution (which we call $y_h$):
$$\exp(t)\,y_h(t) = C$$
and thus the general solution is:
$$y_h(t) = C\exp(-t)$$

Step 2: Replace $C$ with an unknown function, plug this into (3.9), and solve for
the unknown function.

Essentially, we will guess that the solution to (3.9) has the form $y(t) = v(t)\exp(-t)$,
where $v(t)$ is an unknown function we need to find. Since $\exp(-t)$ is everywhere nonzero,
every solution of (3.9) technically can be written in the form $v(t)\exp(-t)$ (i.e., if $y(t)$
is a solution of (3.9), then $v(t) := y(t)\exp(t)$ works). If $y(t) = v(t)\exp(-t)$, then
$y'(t) = v'(t)\exp(-t) - v(t)\exp(-t)$. Plugging these things into (3.9) yields:

$$\begin{aligned}
y' + y &= \exp(t)\\
v'(t)\exp(-t) - v(t)\exp(-t) + v(t)\exp(-t) &= \exp(t)\\
v'(t)\exp(-t) &= \exp(t)\\
v'(t) &= \exp(2t).
\end{aligned}$$

Solving for $v(t)$ (by integrating), we get that

$$v(t) = \frac{1}{2}\exp(2t) + C.$$
Thus the general solution to (3.9) is

$$y(t) = \left(\frac{\exp(2t)}{2} + C\right)\exp(-t) = \frac{\exp(t)}{2} + C\exp(-t).$$


2 2 
Here are the steps in general for the method of variation of parameters: 


Variation of Parameters 3.3.20. Consider the first-order linear differential
equation:

(3.10) $y' + f(t)y = g(t)$
with corresponding homogeneous equation:
(3.11) $y' + f(t)y = 0$.

The method of variation of parameters to solve (3.10) consists of:
(1) First find the solution $y_h(t)$ to the homogeneous equation (3.11):

$$y_h(t) = \exp\left(-\int f(t)\,dt\right) = \frac{1}{\mu(t)}$$

where $\mu(t) = \exp\left(\int f(t)\,dt\right)$ is the usual integrating factor.
(2) Either substitute $y = v(t)y_h(t)$ into (3.10) and solve for $v(t)$, or else directly
solve:
$$v' = \frac{g(t)}{y_h(t)}$$
with direct integration. The general solution for $v$ will contain a constant of
integration $C$.
(3) Write down the general solution to (3.10):

$$y(t) = v(t)y_h(t).$$


Note that in steps (1) and (2) you are basically performing the same two integrations 
that you do in the usual method of solving first-order linear differential equations. 
Thus not much is gained from choosing to use variation of parameters, except 
perhaps another point of view. 
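
To make the three steps concrete, here is a small numerical sketch of variation of parameters for $y' + f(t)y = g(t)$. Everything in it is an illustrative assumption rather than part of the notes: the integrals are taken as definite integrals from a base point `t0` (so that the constant $C$ equals $y(t_0)$), they are approximated with a composite Simpson rule, and the names `integrate` and `variation_of_parameters` are made up for this example. It is tested on Example 3.3.19, where $f = 1$ and $g = \exp(t)$.

```python
import math

def integrate(func, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def variation_of_parameters(f, g, t0, C):
    """Return y(t) = v(t) * y_h(t) for the equation y' + f(t) y = g(t).

    Step 1: y_h(t) = exp(-int_{t0}^{t} f) solves the homogeneous equation.
    Step 2: v(t) = C + int_{t0}^{t} g(s) / y_h(s) ds solves v' = g / y_h.
    Step 3: y = v * y_h. With these choices y(t0) = C.
    """
    def y_h(t):
        return math.exp(-integrate(f, t0, t))

    def y(t):
        v = C + integrate(lambda s: g(s) / y_h(s), t0, t)
        return v * y_h(t)

    return y

# Example 3.3.19: y' + y = exp(t), here with y(0) = 2.
y = variation_of_parameters(lambda t: 1.0, lambda t: math.exp(t), t0=0.0, C=2.0)
t = 1.5
# Closed form with y(0) = 2: ((exp(2t) - 1)/2 + 2) exp(-t) = sinh(t) + 2 exp(-t).
expected = math.sinh(t) + 2.0 * math.exp(-t)
print(y(t), expected)
```

The closed form used for comparison is the general solution $\frac{1}{2}\exp(t) + C'\exp(-t)$ from the example with $C' = 3/2$, which is the value forced by $y(0) = 2$.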


3.4. Implicit equations and differential forms 


In this section we recall some facts from calculus about implicit equations and
introduce the auxiliary tool of differentials and differential forms. By way of motivation,
recall that we are ultimately interested in this chapter in solving explicit differential
equations:
$$y' = F(t,y)$$
These equations can in general be much nastier than the first-order linear differential
equations we studied in Section 3.3. The reason is that in general the two-variable
function $F(t,y)$ might entangle the variables $t$ and $y$ together in some
more complicated way than just “$-f(t)y + g(t)$”. In calculus, $y$ is most of the
time an explicit function of $t$, i.e., $y = h(t)$ for some one-variable
function $h$. However, this is ultimately a very special case and rather restrictive.
Consequently:
We must abandon our desire for


$y$ to always be an explicit function of $t$.


Instead, we will work with implicitly defined equations:
Definition 3.4.1. An implicit equation is a relation which can be written in the
form:

$$F(t,y) = 0$$
where $F$ is a function of two⁴ variables. Given a two-variable function $F(t,y)$ and
a constant $C \in \mathbb{R}$, we call the implicit equation:

$$F(t,y) = C$$
a level set of $F$.
Here is a very natural example of an implicit equation: 
Example 3.4.2 (Circles). Consider the function:

$$F(t,y) = t^2 + y^2$$

Then for $C \in \mathbb{R}$, the level set:

$$t^2 + y^2 = C^2$$
is the implicit equation which defines the circle of radius $|C|$ in the $ty$-plane. If
$C \neq 0$, then the graph of $t^2 + y^2 = C^2$ is not the graph of a function since it fails the
vertical line test. However, if we are interested in a certain point, say $(\sqrt{2}/2, \sqrt{2}/2)$ on the
circle $t^2 + y^2 = 1$, then we can obtain an explicit function:

$$y(t) = \sqrt{1 - t^2}, \qquad y : [-1,1] \to \mathbb{R}$$

which passes through this point, and matches up with the top half of the full circle.
If we are instead interested in the point $(1,0)$, then we can instead look at the
function:
$$t(y) = \sqrt{1 - y^2}, \qquad t : [-1,1] \to \mathbb{R}$$

which passes through this point and matches up with the right half of the full circle.
We illustrate this example in Figure 3.3.


FIGURE 3.3. Implicit equation versus explicit equations for a circle. Panels:
(A) Explicit equation $y(t) = \sqrt{1 - t^2}$, (B) Implicit equation, (C) Explicit equation.


This illustrates in general how implicit equations work: implicit equations are not 
functions, but given a certain point (to, yo) on the equation, there will be some 
function y(t) or t(y) which passes through the point and satisfies the equation. 


⁴ This definition generalizes to more than two variables, but we will restrict our attention to
two variables in this section.
Question 3.4.3. We know how to compute the derivative of an explicit function
$y = f(t)$. The derivative is again an explicit function $dy/dt = f'(t)$. How do you
“take the derivative” of an implicit equation $F(t,y) = 0$, and what type of object is
“the derivative”?


ANSWER. “The derivative” of an implicit equation F(t, y) = 0 is a brand new type 
of object, called a differential form: 


Definition 3.4.4. A differential form is a formal expression of the form: 


P(t, y) dt + Q(t, y) dy 


where P,Q are two-variable functions and dt and dy are meaningless placeholders 
associated to the variables t and y called differentials. Differential forms can be 
added together in the natural way, and you can multiply them (from the left) by 
arbitrary functions R(t, y). 


The right notion of “taking the derivative” here is to compute the differential 
of F(t, y): 


Definition 3.4.5. Given a two-variable function $F(t,y)$, the differential of $F$
(notation: $dF$) is the differential form:
$$dF := \frac{\partial F}{\partial t}(t,y)\,dt + \frac{\partial F}{\partial y}(t,y)\,dy$$

Ultimately, we don’t have to fully understand what the differential really does or 
what a differential form really is. We just need to know how to use them for certain 
types of computations. For us differential forms will appear as transient objects 
which make our calculations easier (for instance, see Example 3.4.6), especially
when working with implicit equations and general first-order explicit differential 
equations. If you like, you can think of the differential dF as a “storage device” 
which contains all the “derivative information” associated with F(t, y). 


We give an application of how you can use differential forms to compute implicit 
derivatives: 


Example 3.4.6 (Implicit derivatives). Consider the implicitly defined equation
$t^2 + y^2 - 1 = 0$ (circle of radius 1 in the $ty$-plane) and the point $(\sqrt{2}/2, \sqrt{2}/2)$. What
is the derivative $dy/dt$ of the implicitly defined function at the point $(\sqrt{2}/2, \sqrt{2}/2)$?


SOLUTION. One way to do this is to first notice that $(\sqrt{2}/2, \sqrt{2}/2)$ lies on the upper
half-circle, so it is a point on the graph of the explicit function:

$$y(t) = \sqrt{1 - t^2}$$

Then we can compute:

$$\frac{dy}{dt} = \frac{-t}{\sqrt{1 - t^2}}$$
and then plug in $t = \sqrt{2}/2$:

$$\frac{dy}{dt}\left(\frac{\sqrt{2}}{2}\right) = \frac{-\sqrt{2}/2}{\sqrt{1 - (\sqrt{2}/2)^2}} = -1.$$


This seems like an annoying way to answer this question because you have to:
(1) First, find an explicit function $y(t)$ which goes through the point and
agrees with the implicit equation. This can sometimes be very hard or 
impossible to do exactly. 

(2) Second, take the derivative of said explicit function. In our case, it also 
was annoying because we had to deal with the derivative of a square-root. 


Here is a better way to do it: 
First: Compute the differential of the equation $t^2 + y^2 - 1 = 0$. This will be:

$$2t\,dt + 2y\,dy = 0.$$

Second: “Solve” for $dy/dt$: since

$$2t\,dt + 2y\,dy = 0,$$
we can subtract $2t\,dt$ from both sides:
$$2y\,dy = -2t\,dt$$
and then divide both sides by $2y$ and “divide” both sides by $dt$:
$$\frac{dy}{dt} = -\frac{2t}{2y} = -\frac{t}{y}$$
Third: Plug in the point of interest:
$$\frac{dy}{dt} = -\frac{\sqrt{2}/2}{\sqrt{2}/2} = -1.$$

Although we write “solve” and “divide”, we aren’t actually doing anything 
sketchy. Given a correct and careful definition of differential forms and differential 
(which we won’t go into), all of these steps are completely legitimate. Hopefully 
you are convinced that this is a much easier way to answer the question. 

Another benefit of the differential form is that it is in some sense “coordinate 
neutral”. For instance, suppose we asked a followup question: what is the derivative 
dt/dy at the point (1,0)? Then we could just take the differential and “solve” for 
dt/dy in the same way: 

$$\frac{dt}{dy} = -\frac{y}{t}$$
and so at $(1,0)$:
$$\frac{dt}{dy} = -\frac{0}{1} = 0.$$


In some sense, the general process we will learn for solving a differential equation
$y' = F(t,y)$ is just this process in reverse (with a few more complications).
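
The two slopes read off from the differential $2t\,dt + 2y\,dy = 0$ can be compared with finite-difference slopes of the explicit half-circle functions. This is only an illustrative check using the Python standard library; the helper names and the step size `h` are arbitrary choices.

```python
import math

def dy_dt_from_differential(t, y):
    # From 2t dt + 2y dy = 0: dy/dt = -t/y (needs y != 0).
    return -t / y

def dt_dy_from_differential(t, y):
    # The same differential solved the other way: dt/dy = -y/t (needs t != 0).
    return -y / t

def upper_half(t):
    # Explicit function y(t) = sqrt(1 - t^2) through (sqrt(2)/2, sqrt(2)/2).
    return math.sqrt(1.0 - t * t)

def right_half(y):
    # Explicit function t(y) = sqrt(1 - y^2) through (1, 0).
    return math.sqrt(1.0 - y * y)

def central_difference(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)

p = math.sqrt(2) / 2
slope_implicit = dy_dt_from_differential(p, p)
slope_explicit = central_difference(upper_half, p)
slope_right_implicit = dt_dy_from_differential(1.0, 0.0)
slope_right_explicit = central_difference(right_half, 0.0)

print(slope_implicit, slope_explicit)              # both close to -1
print(slope_right_implicit, slope_right_explicit)  # both close to 0
```

Notice that `upper_half` cannot be used at the point $(1,0)$ (its derivative blows up there), while the differential handles both points with the same one-line computation.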


3.5. Separable and exact differential equations 


In this section we will see how to essentially do the process in Example 3.4.6 in
reverse, in order to solve an explicit first-order differential equation. As we will see, a
much larger family of differential equations (beyond just the first-order linear ones)
can be solved with this method. However, this method doesn’t always guarantee an
exact solution because in the worst case it requires you to solve a partial differential
equation (PDE) which can be hard or impossible to solve exactly.
Obtaining a differential form equation. Given an explicit first-order
differential equation

(3.12) $\dfrac{dy}{dt} = f(t,y)$
the first step is to rewrite this as a differential form equation:
(3.13) $P(t,y)\,dt + Q(t,y)\,dy = 0$

This can be done with the following steps:
(Step 1) “Multiply” both sides of (3.12) by $dt$ to get:
$$dy = f(t,y)\,dt$$
(Step 2) Subtract $f(t,y)\,dt$ from both sides to get:
$$-f(t,y)\,dt + dy = 0$$
(Step 3) If necessary, multiply both sides by some carefully chosen integrating
factor $\mu(t,y)$:
$$-f(t,y)\mu(t,y)\,dt + \mu(t,y)\,dy = 0$$
Step 3 is the most important step, as this puts the differential form equation into a
form we can “integrate” (i.e., compute an inverse of the differential $d$). We will see
through examples some heuristics for how to do this for certain families of functions.
We will also show how to check if the differential form equation can be solved. In
the worst case, however, finding the right integrating factor $\mu(t,y)$ requires solving
a PDE.


Separable differential equations. As a warmup, we will study a family of 
equations for which this process always works, the so-called separable differential 
equations: 


Definition 3.5.1. A separable equation is an explicit first-order differential
equation of the form:

(i) either
$$\frac{dy}{dt} = f(t)g(y)$$
(ii) or
$$\frac{dy}{dt} = \frac{f(t)}{g(y)}$$

where $f, g$ are one-variable functions. Note that every equation of the form
(ii) is also an equation of the form (i):

$$\frac{dy}{dt} = f(t)\left(\frac{1}{g(y)}\right) = f(t)h(y)$$
where $h = 1/g$. Thus we will restrict our attention to equations of the form
(i).

The reason that a separable equation is called “separable” is that we can
separate the variables $t$ and $y$ when performing Steps 1-3 above. Here are some
examples:


Example 3.5.2. Here are some examples of separable equations and the
corresponding “separated” differential form equation:
(1) $\dfrac{dy}{dt} = ty$. In this case, Step 1 and Step 2 yield:
$$-ty\,dt + dy = 0.$$
Now multiply both sides by $1/y$ to obtain:
$$-t\,dt + \frac{dy}{y} = 0.$$

(2) $\dfrac{dy}{dt} = e^{t-y}$. Recognize this equation as $dy/dt = e^t e^{-y}$. Then Step 1 and
Step 2 yield:
$$-e^t e^{-y}\,dt + dy = 0$$
Multiplying both sides by $e^y$ then gives us:
$$-e^t\,dt + e^y\,dy = 0$$
(3) $\dfrac{dy}{dt} = ty + y$. Rewrite this as $dy/dt = (t+1)y$. Then get:
$$-(t+1)y\,dt + dy = 0$$
and multiplying by $1/y$ yields:

$$-(t+1)\,dt + \frac{dy}{y} = 0.$$

These examples show that in general a separable equation

$$\frac{dy}{dt} = f(t)g(y)$$
gives rise to the differential form equation
$$-f(t)\,dt + \frac{dy}{g(y)} = 0$$

Since the coefficient of $dt$ is a function of $t$ only and the coefficient of $dy$ is a
function of $y$ only, we can “integrate” this differential form equation using the
following:
Observation 3.5.3. Given a separated differential form equation:
(3.14) $P(t)\,dt + Q(y)\,dy = 0$
Define the two-variable function:
$$F(t,y) := \int P(t)\,dt + \int Q(y)\,dy$$
Then
$$dF = \frac{\partial}{\partial t}\left(\int P(t)\,dt\right) dt + \frac{\partial}{\partial y}\left(\int Q(y)\,dy\right) dy = P(t)\,dt + Q(y)\,dy$$
Thus, the implicit equation
$$F(t,y) = C$$
where $C \in \mathbb{R}$ is arbitrary, is the “general solution” to the differential form
equation (3.14).

In other words, to integrate a separated differential form equation, you just compute
two one-variable integrals, a $dt$-integral and a $dy$-integral. Each one gives you a
constant of integration, but these constants of integration can be combined into
one and put on the righthand side of the equation. We illustrate this with a few
examples:
Example 3.5.4. Continuing with our examples from Example 3.5.2:

(1) Given our differential form equation

$$-t\,dt + \frac{dy}{y} = 0$$

we integrate both parts of the lefthand side separately to get:

$$-\int t\,dt + \int \frac{dy}{y} = -\frac{t^2}{2} + \ln|y| = C.$$

Thus the general solution, as an implicit equation, is:

$$-\frac{t^2}{2} + \ln|y| = C.$$

(2) Given our differential form equation
$$-e^t\,dt + e^y\,dy = 0$$
we integrate to get:
$$-\int e^t\,dt + \int e^y\,dy = -e^t + e^y = C.$$
Thus the general solution, as an implicit equation, is:
$$-e^t + e^y = C.$$

(3) Given our differential form equation

$$-(t+1)\,dt + \frac{dy}{y} = 0$$

we integrate to get

$$-\int (t+1)\,dt + \int \frac{dy}{y} = -\frac{(t+1)^2}{2} + \ln|y| = C.$$
Thus the general solution, as an implicit equation, is:
$$-\frac{(t+1)^2}{2} + \ln|y| = C.$$

Here is a convention for this class involving separable (and also exact) equations
below:


Convention 3.5.5. If we ask for the general solution to a separable or exact
differential equation, you may leave the general solution in implicit form unless we
specifically ask you to put it in explicit form, in which case you have to solve for $y$
in terms of $C$. If a term $|y| = u(t; C)$ shows up, then this simplifies to
$y = \pm u(t; C)$.


Example 3.5.6. We will continue with the three examples from Example 3.5.4,
giving the general solution in explicit form:
(1) Our general solution in implicit form is $-t^2/2 + \ln|y| = C$. Solving for $y$
yields:
$$\begin{aligned}
\ln|y(t)| &= \frac{t^2}{2} + C\\
|y(t)| &= \exp\left(\frac{t^2}{2} + C\right)\\
y(t) &= \pm\exp\left(\frac{t^2}{2} + C\right) \qquad \text{(general solution)}
\end{aligned}$$
(2) Our general solution in implicit form is $-e^t + e^y = C$. Solving for $y$ yields:
$$\begin{aligned}
-e^t + e^y &= C\\
e^y &= e^t + C\\
y(t) &= \ln(e^t + C) \qquad \text{(general solution)}
\end{aligned}$$

Note: we do not put absolute values in the last step. The second equation
$e^y = e^t + C$ tells us that $e^t + C$ must be positive. This places
additional conditions on the constant $C$ and the domain of the general
solution (which will be a function of $C$), something we will not bother
with.

(3) Our general solution in implicit form is $-(t+1)^2/2 + \ln|y| = C$. Solving
for $y$ yields:

$$\begin{aligned}
\ln|y(t)| &= \frac{(t+1)^2}{2} + C\\
|y(t)| &= \exp\left(\frac{(t+1)^2}{2} + C\right)\\
y(t) &= \pm\exp\left(\frac{(t+1)^2}{2} + C\right) \qquad \text{(general solution)}
\end{aligned}$$
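
The three explicit general solutions can be spot-checked numerically by plugging each one back into its differential equation. The sketch below is an illustration under stated assumptions: the constant `C = 0.3`, the sample points, and the choice of sign in cases (1) and (3) are arbitrary, and derivatives are approximated by central differences.

```python
import math

def central_difference(func, t, h=1e-6):
    return (func(t + h) - func(t - h)) / (2 * h)

C = 0.3  # arbitrary sample value of the constant of integration

# Each entry pairs an explicit solution y(t) with the right-hand side F(t, y)
# of its equation y' = F(t, y).
cases = [
    # (1) y' = t y,       y = -exp(t^2/2 + C)      (negative branch of the +/-)
    (lambda t: -math.exp(t * t / 2 + C), lambda t, y: t * y),
    # (2) y' = e^(t - y), y = ln(e^t + C)
    (lambda t: math.log(math.exp(t) + C), lambda t, y: math.exp(t - y)),
    # (3) y' = t y + y,   y = +exp((t+1)^2/2 + C)  (positive branch of the +/-)
    (lambda t: math.exp((t + 1) ** 2 / 2 + C), lambda t, y: t * y + y),
]

residuals = [
    abs(central_difference(sol, t) - rhs(t, sol(t)))
    for sol, rhs in cases
    for t in (-1.0, 0.0, 0.5, 1.0)
]
worst = max(residuals)
print("largest residual:", worst)
```

Both signs in $\pm\exp(\cdots)$ give solutions; which one applies in an initial value problem is decided by the sign of $y_0$.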


Here is the convention for initial value problems: 


Convention 3.5.7. Suppose our separable or exact differential equation has implicit
general solution:

$$F(t,y) = C$$
and we also have an initial condition $y(t_0) = y_0$. Then:

(1) First solve for $C$ by noticing $C = F(t_0, y_0)$. If we do not explicitly ask for
the particular solution in explicit form, then you may stop here.

(2) If we do ask for the explicit solution, then solve $F(t,y) = C$ (with the new
exact value for $C$) for $y$, using the initial condition $y(t_0) = y_0$ anytime you
have to make a choice (e.g., dealing with absolute values, or square roots).
The interval of existence will be the largest interval which contains $t_0$ for
which $y(t)$ is naturally defined.


Example 3.5.8. Find the particular solution (in explicit form) for the following
initial value problem:

(i) $y' = ty$
(ii) $y(1) = 1$
SOLUTION. We have found the implicit general solution in Example 3.5.4 to be:
$$-\frac{t^2}{2} + \ln|y| = C$$

Solving for $C$ yields:

$$C = -\frac{1^2}{2} + \ln|1| = -\frac{1}{2}.$$
Next we solve for $y(t)$:
$$\begin{aligned}
-\frac{t^2}{2} + \ln|y(t)| &= -\frac{1}{2}\\
\ln|y(t)| &= \frac{t^2}{2} - \frac{1}{2}\\
|y(t)| &= \exp\left(\frac{t^2}{2} - \frac{1}{2}\right)\\
y(t) &= \exp\left(\frac{t^2}{2} - \frac{1}{2}\right)
\end{aligned}$$

Here we needed to take the righthand side to be positive when we removed the
absolute values because $y_0 = 1$ is positive. The interval of existence is all of $\mathbb{R}$ as
this is the natural domain of the righthand function of $t$.


We end our separable discussion with a remark about dividing by zero: 


Remark 3.5.9. Suppose we have an initial value problem:
(i) $y' = f(t)g(y)$
(ii) $y(t_0) = y_0$
and $g(y_0) = 0$. Then the constant function $y(t) = y_0$ for all $t$ is a solution and the
interval of existence is the largest possible interval which contains $t_0$ for which $f(t)$
is defined.


Exact differential equations. Now we move on to the general case: 


(3.15) $y' = f(t,y)$


where $f$ is a two-variable function which might not be separable (i.e., it might not
be of the form $f(t,y) = g(t)h(y)$). Recall that the first order of business is to
translate equation (3.15) into a suitable differential form equation:


(Step 1) Rewrite as $\dfrac{dy}{dt} = f(t,y)$.

(Step 2) “Multiply” both sides by $dt$, then add $-f(t,y)\,dt$ to both sides to obtain:
$$-f(t,y)\,dt + dy = 0$$
(Step 3) Multiply both sides by a carefully chosen integrating factor $\mu(t,y)$:
$$-f(t,y)\mu(t,y)\,dt + \mu(t,y)\,dy = 0.$$
This will give us a differential form equation:
(3.16) $P(t,y)\,dt + Q(t,y)\,dy = 0$.


Of course, we have said nothing yet about how to find the integrating factor $\mu(t,y)$,
or what it needs to do. Ultimately, to solve a differential form equation of the
form (3.16), we need to find a so-called potential function:
Definition 3.5.10. A potential function for (3.16) is a two-variable function
$F(t,y)$ such that

(1) $\dfrac{\partial F}{\partial t} = P(t,y)$, and
(2) $\dfrac{\partial F}{\partial y} = Q(t,y)$.
In other words, a potential function is like an antiderivative of a differential form.


Unfortunately, not every differential form has a potential function. This raises the
question:


Question 3.5.11. When does the differential form P(t, y) dt + Q(t, y) dy have a 
potential function? 


ANSWER. First, we will define what it means for a differential form to have a 
potential function: 


Definition 3.5.12. Suppose $P, Q : D \to \mathbb{R}$ are continuous two-variable functions
on a nice domain $D \subseteq \mathbb{R}^2$. We say that the differential form

$$P\,dt + Q\,dy$$
is exact if there exists a continuously differentiable function $F : D \to \mathbb{R}$ such that
$$dF = P\,dt + Q\,dy.$$


Next, we isolate a necessary condition (which is easily checkable) for a differential
form to be exact. We will further assume that $P$ and $Q$ are continuously
differentiable (this will be the case for all the functions we shall encounter). Suppose
$F(t,y)$ is a potential function of $P(t,y)\,dt + Q(t,y)\,dy$, so $F$ will have to have
continuous second-order partial derivatives (in order for $P$ and $Q$ to have continuous
first-order partial derivatives). Then by the Clairaut-Schwarz Theorem [2.3.13]
it follows that:

$$\frac{\partial^2 F}{\partial t\,\partial y} = \frac{\partial^2 F}{\partial y\,\partial t}$$
Thus, since $\frac{\partial F}{\partial t} = P$ and $\frac{\partial F}{\partial y} = Q$, this says that
$$\frac{\partial Q}{\partial t} = \frac{\partial P}{\partial y}$$

i.e., the partial derivatives of $P$ and $Q$ with respect to the other variable must be
the same. This motivates the following definition:


Definition 3.5.13. Suppose $P, Q : D \to \mathbb{R}$ are continuously differentiable
two-variable functions on a nice domain $D \subseteq \mathbb{R}^2$. We say that the differential form

$$P\,dt + Q\,dy$$
is closed if
$$\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial t} = 0$$

i.e., if the lefthand side is the constant zero function.
Clearly, in order for a differential form to be exact, it must also be closed (which
is a very easy condition to check). What about the converse? As it turns out, if we
impose a natural condition on the domain $D$, then these two are equivalent:


Theorem 3.5.14. Suppose $P, Q : I \times J \to \mathbb{R}$ are continuously differentiable
functions and $I, J \subseteq \mathbb{R}$ are intervals (so the common domain of $P$ and $Q$ is a rectangle).
Then the following are equivalent:


(1) the differential form $P\,dt + Q\,dy$ is exact.
(2) the differential form $P\,dt + Q\,dy$ is closed.


This provides an answer to the original question, namely, if the functions $P$
and $Q$ are nice (continuously differentiable, which they always will be for us), and
the domain is a rectangle, then the differential form $P\,dt + Q\,dy$ has a potential
function iff $P\,dt + Q\,dy$ is closed, i.e., iff $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial t}$.


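Here is an example which shows how Theorem 3.5.14 can fail if the domain of $P, Q$
is not a rectangle:


Example 3.5.15. Consider the differential form equation:

$$\frac{-y}{t^2 + y^2}\,dt + \frac{t}{t^2 + y^2}\,dy = 0.$$

Here the domain of the coefficient functions is $\mathbb{R}^2 \setminus \{(0,0)\}$, i.e., the entire $ty$-plane
except the origin. It is easy to see that this differential form is closed:

$$\frac{\partial}{\partial y}\left(\frac{-y}{t^2 + y^2}\right) = \frac{y^2 - t^2}{(t^2 + y^2)^2} = \frac{\partial}{\partial t}\left(\frac{t}{t^2 + y^2}\right).$$

However, the differential form is not exact. Assume towards a contradiction that
there exists a potential function $F : \mathbb{R}^2 \setminus \{(0,0)\} \to \mathbb{R}$. Then on the one hand we
would have

$$\int_0^{2\pi} \frac{d}{d\theta} F(\cos\theta, \sin\theta)\,d\theta = F(1,0) - F(1,0) = 0.$$

On the other hand, we have by the (multivariable) chain rule:

$$\begin{aligned}
\frac{d}{d\theta} F(\cos\theta, \sin\theta) &= \frac{\partial F}{\partial t}(-\sin\theta) + \frac{\partial F}{\partial y}\cos\theta\\
&= \frac{-\sin\theta}{\cos^2\theta + \sin^2\theta}(-\sin\theta) + \frac{\cos\theta}{\cos^2\theta + \sin^2\theta}\cos\theta\\
&= 1,
\end{aligned}$$
which implies that $\int_0^{2\pi} \frac{d}{d\theta} F(\cos\theta, \sin\theta)\,d\theta = 2\pi \neq 0$. This is a contradiction, and
so no such potential function $F$ can exist.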
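
The obstruction in this example can also be seen numerically: integrating the form $P\,dt + Q\,dy$ once around the unit circle gives $2\pi$ rather than $0$, which could not happen if a potential function existed. The sketch below approximates that loop integral with a midpoint rule; the function names and the number of subintervals are illustrative choices.

```python
import math

def P(t, y):
    return -y / (t * t + y * y)

def Q(t, y):
    return t / (t * t + y * y)

def circulation(n=1000):
    """Midpoint-rule approximation of the integral of P dt + Q dy
    over the unit circle, parametrized by (cos(theta), sin(theta))."""
    total = 0.0
    d_theta = 2 * math.pi / n
    for i in range(n):
        theta = (i + 0.5) * d_theta
        t, y = math.cos(theta), math.sin(theta)
        dt, dy = -math.sin(theta), math.cos(theta)  # derivatives in theta
        total += (P(t, y) * dt + Q(t, y) * dy) * d_theta
    return total

loop_integral = circulation()
print(loop_integral, 2 * math.pi)
```

For an exact form, the same computation over any closed loop would return a value close to zero.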


We now provide an example of checking whether a given differential form is closed
(and also exact):
Example 3.5.16. (1) $(2t + y)\,dt + (t - 6y)\,dy$. First we compute the partial
derivatives $\frac{\partial P}{\partial y}$ and $\frac{\partial Q}{\partial t}$:

$$\frac{\partial}{\partial y}(2t + y) = 1$$

$$\frac{\partial}{\partial t}(t - 6y) = 1$$

Thus $(2t + y)\,dt + (t - 6y)\,dy$ is closed. Since both $P$ and $Q$ are defined
on the rectangle $\mathbb{R} \times \mathbb{R}$, by Theorem 3.5.14 this differential form is exact,
hence there exists a potential function for it.

(2) $(2t + \ln y)\,dt + ty\,dy$. First we compute the relevant partials:

$$\frac{\partial}{\partial y}(2t + \ln y) = \frac{1}{y}$$
$$\frac{\partial}{\partial t}(ty) = y$$

Since these partial derivatives are not equal, the differential form is not
closed, hence it is not exact.
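
The closedness test $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial t}$ is easy to automate. The following sketch estimates both partial derivatives with central differences at a sample point; the helper name `closedness_gap`, the step size, and the sample point $(0.4, 1.7)$ are assumptions made for illustration. A gap near zero at many points suggests (but does not prove) that a form is closed, while a clearly nonzero gap at a single point shows that it is not.

```python
import math

def closedness_gap(P, Q, t, y, h=1e-5):
    """Central-difference estimate of dP/dy - dQ/dt at the point (t, y)."""
    dP_dy = (P(t, y + h) - P(t, y - h)) / (2 * h)
    dQ_dt = (Q(t + h, y) - Q(t - h, y)) / (2 * h)
    return dP_dy - dQ_dt

# Form (1): (2t + y) dt + (t - 6y) dy
gap_1 = closedness_gap(lambda t, y: 2 * t + y,
                       lambda t, y: t - 6 * y,
                       0.4, 1.7)

# Form (2): (2t + ln y) dt + t y dy, sampled at a point with y > 0
gap_2 = closedness_gap(lambda t, y: 2 * t + math.log(y),
                       lambda t, y: t * y,
                       0.4, 1.7)

print("form (1):", gap_1)  # close to 0
print("form (2):", gap_2)  # close to 1/1.7 - 1.7, which is not 0
```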


The next order of business is to solve for a potential function of an exact differential 
form. This can be done with the following steps: 


Finding a potential function of an exact differential form 3.5.17. Suppose
the differential form $P(t,y)\,dt + Q(t,y)\,dy$ is exact. The solution to the differential
form equation
$$P(t,y)\,dt + Q(t,y)\,dy = 0$$
is $F(t,y) = C$, where $F$ is a potential function of $P(t,y)\,dt + Q(t,y)\,dy$. A potential
function $F$ can be found in the following steps:
(1) First solve $\frac{\partial F}{\partial t} = P$ by integrating with respect to $t$:

(3.17) $F(t,y) = \displaystyle\int P(t,y)\,dt + \phi(y)$

where $\phi(y)$ is an unknown function of $y$ only. Here $\phi(y)$ plays the role
of “constant of integration”, except that since we are considering partial
derivatives and integrating with respect to $t$ only, we have to allow our
constant of integration to in fact be a function of $y$.

(2) Next, we need to find what $\phi(y)$ is. Since we know $\frac{\partial F}{\partial y} = Q(t,y)$, we can
differentiate (3.17) with respect to $y$:

$$\frac{\partial}{\partial y}\int P(t,y)\,dt + \phi'(y) = Q(t,y),$$
and thus

$$\phi(y) = \int \left(Q(t,y) - \frac{\partial}{\partial y}\int P(t,y)\,dt\right) dy$$

(3) Now that we know what function $\phi(y)$ is, our general solution (in implicit
form) is:
$$F(t,y) = C.$$


We give some examples as to how this process works:

Example 3.5.18. In each of the following examples, the differential form is exact
and we will solve the indicated differential form equation.
(1) $(2t\sin y + y^3 e^t)\,dt + (t^2\cos y + 3y^2 e^t)\,dy = 0$. First we verify that the
differential form is exact. Indeed:

$$\frac{\partial}{\partial y}(2t\sin y + y^3 e^t) = 2t\cos y + 3y^2 e^t$$

$$\frac{\partial}{\partial t}(t^2\cos y + 3y^2 e^t) = 2t\cos y + 3y^2 e^t$$

Now we will find a potential function $F(t,y)$ for this differential form.
First, using that $\partial F/\partial t = 2t\sin y + y^3 e^t$, we get that

$$F(t,y) = \int (2t\sin y + y^3 e^t)\,dt + \phi(y) = t^2\sin y + y^3 e^t + \phi(y)$$

for some unknown function $\phi(y)$ which is solely a function of $y$. Next we
take the partial derivative of this $F$ with respect to $y$ and set it equal to
$t^2\cos y + 3y^2 e^t$:

$$\frac{\partial F}{\partial y} = \frac{\partial}{\partial y}\big(t^2\sin y + y^3 e^t + \phi(y)\big) = t^2\cos y + 3y^2 e^t + \phi'(y) = t^2\cos y + 3y^2 e^t$$

Thus $\phi'(y) = 0$. Integrating with respect to $y$ finally yields $\phi(y) = C$.
Thus our potential function is:

$$F(t,y) = t^2\sin y + y^3 e^t + C.$$
We conclude that our general solution is:
$$t^2\sin y + y^3 e^t + C = 0.$$

Replacing $C$ with $-C$, this general solution is equivalent to:

$$t^2\sin y + y^3 e^t = C.$$
(2) $(1 + (1 + ty)e^{ty})\,dt + (1 + t^2 e^{ty})\,dy = 0$. First we verify that the differential
form is exact. Indeed:

$$\frac{\partial}{\partial y}\big(1 + (1 + ty)e^{ty}\big) = (1 + ty)\,t e^{ty} + t e^{ty} = e^{ty}(2t + t^2 y)$$

$$\frac{\partial}{\partial t}\big(1 + t^2 e^{ty}\big) = 2t e^{ty} + t^2 y e^{ty} = e^{ty}(2t + t^2 y)$$

Now we will find a potential function for this differential form. First, using
that $\partial F/\partial t = 1 + (1 + ty)e^{ty}$, we get that

$$F(t,y) = \int \big(1 + (1 + ty)e^{ty}\big)\,dt + \phi(y) = t e^{ty} + t + \phi(y)$$

for some unknown function $\phi(y)$. Next, we take the partial derivative of
this $F$ with respect to $y$ and set it equal to $1 + t^2 e^{ty}$:

$$\frac{\partial F}{\partial y} = \frac{\partial}{\partial y}\big(t e^{ty} + t + \phi(y)\big) = t^2 e^{ty} + \phi'(y) = 1 + t^2 e^{ty}$$

Thus $\phi'(y) = 1$. Integrating with respect to $y$ yields $\phi(y) = y + C$. We
conclude that our potential function is:

$$F(t,y) = t e^{ty} + t + y + C$$
and thus our general solution is:
$$t e^{ty} + t + y + C = 0.$$


The integrating factor $\mu(t,y)$. We have not said anything about the integrating
factor yet. Its role is as follows:

The integrating factor makes a non-exact equation exact.

Specifically, here is the definition:

Definition 3.5.19. Suppose $P, Q : D \to \mathbb{R}$ are continuous on a nice domain
$D \subseteq \mathbb{R}^2$. We say that a function $\mu : D \to \mathbb{R}$ is an integrating factor for the
differential form equation

$$P(t,y)\,dt + Q(t,y)\,dy = 0$$
if
(i) $\mu(t,y) \neq 0$ for every $(t,y) \in D$, and
(ii) $\mu(t,y)P(t,y)\,dt + \mu(t,y)Q(t,y)\,dy$ is exact.
In particular, if $D \subseteq \mathbb{R}^2$ is a rectangle, then by Theorem 3.5.14, (ii) is satisfied if
and only if:

$$\frac{\partial}{\partial y}\big(\mu(t,y)P(t,y)\big) = \frac{\partial}{\partial t}\big(\mu(t,y)Q(t,y)\big),$$

i.e., if and only if:

$$\frac{\partial \mu}{\partial t}Q(t,y) + \mu(t,y)\frac{\partial Q}{\partial t} = \frac{\partial \mu}{\partial y}P(t,y) + \mu(t,y)\frac{\partial P}{\partial y}. \tag{3.18}$$
In general, if a differential form equation is not exact, then finding an integrating
factor involves solving the partial differential equation (PDE) given in (3.18). This
can be hard or even impossible to do. For this reason, we will not study techniques for
finding this integrating factor in this class. Here are the conventions for this class
as to what you're expected to know how to do with regard to this integrating
factor:


Convention 3.5.20. You need to know how to do the following things for this 
class: 
(1) Be able to check if a differential form equation is exact, and solve it if it 
is exact. 
(2) Given a non-exact differential form equation, and supplied with a valid 
integrating factor, you need to be able to use the integrating factor to 
solve the equation. 


Here is an example of solving a non-exact differential form equation after being 
supplied with a valid integrating factor. 


Example 3.5.21. Consider the differential form equation $(3t^2 y + 2ty + y^3)\,dt + (t^2 + y^2)\,dy = 0$
and the integrating factor $\mu(t,y) = e^{3t}$. First note that the differential
form is not exact:

$$\frac{\partial}{\partial y}(3t^2 y + 2ty + y^3) = 3t^2 + 2t + 3y^2$$

$$\frac{\partial}{\partial t}(t^2 + y^2) = 2t$$

However, multiplying through by $\mu(t,y) = e^{3t}$ yields the differential form equation:
$$(3t^2 y + 2ty + y^3)e^{3t}\,dt + (t^2 + y^2)e^{3t}\,dy = 0$$

which is exact:

$$\frac{\partial}{\partial y}\big((3t^2 y + 2ty + y^3)e^{3t}\big) = (3t^2 + 2t + 3y^2)e^{3t}$$

$$\frac{\partial}{\partial t}\big((t^2 + y^2)e^{3t}\big) = (t^2 + y^2)\,3e^{3t} + 2t e^{3t} = (3t^2 + 3y^2 + 2t)e^{3t}$$

Now we will solve for the potential function. Using $\partial F/\partial t = (3t^2 y + 2ty + y^3)e^{3t}$ we
get

$$F(t,y) = \int (3t^2 y + 2ty + y^3)e^{3t}\,dt + \phi(y) = y\left(t^2 + \frac{y^2}{3}\right)e^{3t} + \phi(y)$$

Next, taking a partial derivative with respect to $y$ and setting this equal to
$(t^2 + y^2)e^{3t}$ yields:

$$\frac{\partial F}{\partial y} = (t^2 + y^2)e^{3t} + \phi'(y) = (t^2 + y^2)e^{3t}$$

We conclude that $\phi'(y) = 0$. Integrating this with respect to $y$ yields $\phi(y) = C$.
We conclude that our potential function is:

$$F(t,y) = y\left(t^2 + \frac{y^2}{3}\right)e^{3t} + C$$
and thus our general solution is:

$$y\left(t^2 + \frac{y^2}{3}\right)e^{3t} + C = 0.$$
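
The claims in Example 3.5.21 can be spot-checked numerically. The sketch below (our own illustration, using central differences and a handful of arbitrarily chosen sample points) tests three things: that the original form fails the closedness condition, that multiplying by $\mu = e^{3t}$ repairs it, and that $F(t,y) = y(t^2 + y^2/3)e^{3t}$ has the right partial derivatives.

```python
import math


def mu(t, y):
    return math.exp(3 * t)


def P(t, y):
    return 3 * t**2 * y + 2 * t * y + y**3


def Q(t, y):
    return t**2 + y**2


def muP(t, y):
    return mu(t, y) * P(t, y)


def muQ(t, y):
    return mu(t, y) * Q(t, y)


def F(t, y):
    # Candidate potential function for the form muP dt + muQ dy.
    return y * (t**2 + y**2 / 3) * math.exp(3 * t)


def d_dt(g, t, y, h=1e-5):
    return (g(t + h, y) - g(t - h, y)) / (2 * h)


def d_dy(g, t, y, h=1e-5):
    return (g(t, y + h) - g(t, y - h)) / (2 * h)


def close(a, b, rel=1e-6):
    return abs(a - b) <= rel * max(1.0, abs(a), abs(b))


samples = [(0.0, 1.0), (0.5, -1.0), (-1.0, 0.7), (1.0, 0.3)]

# Closedness test dP/dy = dQ/dt, before and after multiplying by mu.
original_closed = all(close(d_dy(P, t, y), d_dt(Q, t, y)) for t, y in samples)
scaled_closed = all(close(d_dy(muP, t, y), d_dt(muQ, t, y)) for t, y in samples)

# F should satisfy dF/dt = mu*P and dF/dy = mu*Q.
F_is_potential = all(
    close(d_dt(F, t, y), muP(t, y)) and close(d_dy(F, t, y), muQ(t, y))
    for t, y in samples
)

print(original_closed, scaled_closed, F_is_potential)
```

Agreement at a few points is evidence rather than proof, but a wrong integrating factor or a mistake in the antiderivative would typically show up immediately.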


3.6. Existence and uniqueness theorems 


We have already seen the full existence and uniqueness theorem for first-order linear
differential equations (Theorem 3.3.8). In this section we will give statements of
other existence and uniqueness theorems.


The following is the corresponding statement for separable differential equations.
Note that in general for separable differential equations, we are only guaranteed
local uniqueness, i.e., a unique solution on a possibly tiny interval $I'$ which
contains $t_0$ (provided $g(y_0) \neq 0$). At this level of generality, we can't really
say what the largest possible interval of existence will be (unlike the statement of
Theorem 3.3.8), although in practice you may be able to determine this when solving
for the explicit solution to an IVP.


Existence and Uniqueness Theorem 3.6.1 (Separable case). Suppose $f : I \to \mathbb{R}$
and $g : J \to \mathbb{R}$ are continuous functions defined on intervals $I$ and $J$. Consider
the initial value problem:

(i) $y' = f(t)g(y)$
(ii) $y(t_0) = y_0$, where $t_0 \in I$, $y_0 \in J$.
(1) If $y_0$ is not an endpoint of $J$ and $g(y_0) \neq 0$, then
(a) the initial value problem (i)+(ii) has a unique solution $y(t) : I' \to \mathbb{R}$,
where $I' \subseteq I$ is some open interval containing $t_0$, and
(b) the solution to the initial value problem can be obtained by solving for
$y$ in the following equation:

$$\int_{y_0}^{y} \frac{ds}{g(s)} = \int_{t_0}^{t} f(s)\,ds$$

(2) If $g(y_0) = 0$, then the constant function $y(t) = y_0$, $y : I \to \mathbb{R}$, is a solution
to (i)+(ii), but it may not be unique.


We now present the main existence theorem for explicit first-order differential equa- 
tions: 


Existence Theorem 3.6.2 (General case). Suppose $f : I \times J \to \mathbb{R}$ is a continuous
two-variable function defined on a rectangle $I \times J$ in the $ty$-plane (so $I, J \subseteq \mathbb{R}$ are
intervals). Then given any point $(t_0, y_0) \in I \times J$, the initial value problem

(i) $y' = f(t,y)$
(ii) $y(t_0) = y_0$
has a solution $y(t)$ defined on some interval $I' \subseteq I$ which contains $t_0$. Furthermore,
the solution will be defined at least until the solution curve $t \mapsto (t, y(t))$ leaves the
rectangle $I \times J$.
The following example illustrates what we mean by “leaving the rectangle”: 
Example 3.6.3. Consider the IVP:

(i) $y' = 1 + y^2$
(ii) $y(0) = 0$
For this differential equation, the function $f(t,y)$ is $f(t,y) = 1 + y^2$, which is defined
everywhere on the $ty$-plane. Thus we can consider its domain to be the rectangle
$\mathbb{R} \times \mathbb{R}$. Solving this as a separable equation yields the solution $y(t) = \tan t$. The
interval of existence is $(-\pi/2, \pi/2)$, since this is the interval in the domain of $\tan t$
which contains $t_0 = 0$. This agrees with the Existence Theorem 3.6.2 since $y(t)$
"leaves the rectangle" at $\pm\pi/2$ in the sense that it has vertical asymptotes at these
$t$-values, so it shoots down/up to $\pm\infty$ at these points and "leaves" the $ty$-plane.
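
To see the blow-up concretely, here is a small numerical sketch. It integrates $y' = 1 + y^2$, $y(0) = 0$ with the classical fourth-order Runge-Kutta method (a standard numerical scheme that is not covered in these notes; the step counts are arbitrary choices) and compares the result with $\tan t$ as $t$ approaches $\pi/2 \approx 1.5708$.

```python
import math


def f(t, y):
    return 1.0 + y * y


def rk4(f, t0, y0, t_end, n):
    """Approximate y(t_end) for y' = f(t, y), y(t0) = y0, using n RK4 steps."""
    h = (t_end - t0) / n
    t, y = t0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y


for t_end in (0.5, 1.0, 1.4, 1.5, 1.55):
    approx = rk4(f, 0.0, 0.0, t_end, 20000)
    print(f"t = {t_end:4.2f}   RK4 = {approx:12.6f}   tan(t) = {math.tan(t_end):12.6f}")
```

The printed values grow rapidly as $t$ nears $\pi/2$, which is the numerical shadow of the vertical asymptote. No fixed-step method can integrate past that point, in agreement with the interval of existence $(-\pi/2, \pi/2)$.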


We also have the main uniqueness theorem for explicit first-order differential equa- 
tions. Note that the uniqueness theorem requires stronger hypotheses than the 
existence theorem, so it holds in fewer situations. 


Uniqueness Theorem 3.6.4 (General case). Suppose $f : I \times J \to \mathbb{R}$ is a continuous
two-variable function defined on a rectangle $I \times J$ in the $ty$-plane (so $I, J \subseteq \mathbb{R}$
are intervals). Furthermore, suppose the partial derivative $\partial f/\partial y$ exists and is
continuous on all of $I \times J$. Let $(t_0, y_0) \in I \times J$, and suppose we have two solutions
$y(t)$, $\tilde{y}(t)$ to the same IVP:

(1) $y'(t) = f(t, y(t))$ and $\tilde{y}'(t) = f(t, \tilde{y}(t))$ for every $t$, and
(2) $y(t_0) = y_0$ and $\tilde{y}(t_0) = y_0$.
Then for every $t$ such that $(t, y(t))$ and $(t, \tilde{y}(t))$ remain in the rectangle $I \times J$, we
have

$$y(t) = \tilde{y}(t).$$
One of the practical benefits of the Uniqueness Theorem is that, provided the
hypotheses of Theorem 3.6.4 are satisfied, then


Different solution curves cannot cross.

Here is a (somewhat exaggerated and contrived) example of this principle:
Example 3.6.5. Consider the differential equation:


=e 


(3.19) y’ = (y—10)sin(2 + y)e® 


and suppose that $y : I \to \mathbb{R}$ is a solution to the equation (3.19) on an interval $I$
which contains $0$. Furthermore, assume that $y(0) = 0$. Then $y(t) < 10$ for all $t \in I$.


JUSTIFICATION. First note that $\tilde{y} : I \to \mathbb{R}$ defined by $\tilde{y}(t) := 10$ for all $t \in I$ is
also a solution of (3.19), since the factor $(y - 10)$ makes both sides of (3.19) identically
zero. Then $\tilde{y}(0) = 10$ whereas $y(0) = 0$, so $y$ and $\tilde{y}$ are different solutions. Thus by
the Uniqueness Theorem 3.6.4 there can be no $t_0 \in I$ such that $y(t_0) = \tilde{y}(t_0)$. Thus the two
differentiable (hence continuous) functions $y$ and $\tilde{y}$ never intersect. Finally, since
$y(0) < \tilde{y}(0)$, it follows from the Intermediate Value Theorem that $y(t) < \tilde{y}(t) = 10$ for
all $t \in I$ (otherwise the two functions would have to intersect somewhere).


Notice that the inequality established in Example 3.6.5 would be hard to establish
directly without the Uniqueness Theorem, since the equation (3.19) looks
hard or impossible to solve exactly.


3.7. Autonomous equations 


In this final section, we will take a look at qualitative properties of solutions of the 
so-called autonomous equations: 


Definition 3.7.1. A first-order differential equation is called an autonomous
equation if it can be written in the form:

$$y' = f(y),$$

i.e., if the right-hand side does not depend on the independent variable $t$.


Autonomous equations are a special case of separable equations, and hence could
be solved using the methods from Section 3.5. However, we will be more interested
in studying the qualitative properties of their solutions, i.e., saying as much as we can
about the solutions without explicitly solving for them.


Example 3.7.2. Consider the autonomous equation

$$y' = (y + 1)(y^2 - 9)$$

Below we have a sketch of the direction field along with several solution curves:

FIGURE 3.4. Direction field for the autonomous equation $y' = (y + 1)(y^2 - 9)$
and several solution curves.


Example 3.7.2 is a rather typical example of an autonomous equation. We make
a few remarks about what we see in Example 3.7.2 which hold for all autonomous
equations:


Remark 3.7.3. Suppose y’ = f(y) is an autonomous equation. 


(1) The direction field does not change as you go from left to right, it only 
changes as you go from bottom to top. This is because the function 
f(t,y) = f(y) is only a function of y and does not depend on t. 

(2) Suppose $y_0(t)$ is a particular solution and $C \in \mathbb{R}$ is a constant. Then
$y_0(t + C)$ (a shift of $y_0$ to the left by $C$) is also a solution. Indeed, by the chain rule:

$$\big(y_0(t + C)\big)' = y_0'(t + C) = f\big(y_0(t + C)\big)$$


(3) Suppose yo € R is such that f(yo) = 0. Then the constant function 
y(t) := yo for all t is a solution to y’ = f(y). Such a number yo is called 
an equilibrium point and the constant function y(t) := yo is called an 
equilibrium solution. 


What about the nonequilibrium solutions? As Example 3.7.2 illustrates, these
solutions are strictly increasing or strictly decreasing, and each end of such a solution
is either asymptotic to one of the equilibrium solutions or diverges to $\pm\infty$.
For this we make the following observations:


(1) Since $y' = f(y)$, if $f(y_0) < 0$, then the solution going through the point
$(t_0, y_0)$ will be strictly decreasing.
(2) Likewise, if $f(y_0) > 0$, then the solution going through the point $(t_0, y_0)$
will be strictly increasing.


This qualitative behavior can be succinctly captured by a so-called phase line: 


Definition 3.7.4. A phase line for the equation y’ = f(y) is a plot of the y-axis 
(displayed horizontally) with the following features: 


(1) At every equilibrium point $y_0$ (i.e., where $f(y_0) = 0$), there is a dot.

(2) In a region between two equilibrium points (or between an equilibrium
point and $\pm\infty$), if $f(y) < 0$ in that region, then there is an arrow to the
left. This tells us that for these $y$-values, the solution is strictly decreasing.

(3) In a region where $f(y) > 0$, there is an arrow to the right. This tells
us that for these $y$-values, the solution is strictly increasing.

(4) At each equilibrium point yo, if the two arrows on either side of yo are 
both pointing towards yo, then the dot at yo is filled in. Otherwise, the 
dot is not filled in. 


Often the phase line is plotted with a vertical f(y)-axis as well, superimposed with 
a graph of the function f(y). 


Example 3.7.5. The phase line for the equation $y' = (y + 1)(y^2 - 9)$ from
Example 3.7.2. To be included.


There are two types of equilibrium points: 


Definition 3.7.6. Consider the autonomous equation y’ = f(y). Suppose yo € R 
is an equilibrium point (i.e., f(yo) = 0). We say that yo is 

(1) asymptotically stable if a solution which goes through a point $(t_0, y_0 + \epsilon)$,
where $|\epsilon| \ll 1$ is very tiny, will asymptotically approach the solution
$y(t) = y_0$. These correspond to the filled-in dots on the phase line.

(2) unstable if it is not asymptotically stable, i.e., if there is some solution
which goes through a point $(t_0, y_0 + \epsilon)$ which "peels off" and is not
asymptotic to the solution $y(t) = y_0$. These correspond to the non-filled-in
dots on the phase line.


In other words, asymptotically stable equilibrium points act like “sinks”, bringing 
nearby solution curves towards the constant solution at that point. Unstable equi- 
librium points, at least on one of the two sides, will “repel” nearby solution curves. 
Since the type of equilibrium point at yo is determined by the sign of the function 
f(y) on both sides of yo, if we know whether f is strictly increasing/decreasing as 
it goes through yo we can determine its type: 


First Derivative Test for Stability 3.7.7. Suppose $y_0$ is an equilibrium point
for the autonomous equation $y' = f(y)$, and suppose $f$ is differentiable. Then:
(1) if $f'(y_0) < 0$, then $f$ is strictly decreasing at $y_0$ and $y_0$ is asymptotically
stable,
(2) if $f'(y_0) > 0$, then $f$ is strictly increasing at $y_0$ and $y_0$ is unstable,
(3) if $f'(y_0) = 0$, then no conclusion can be drawn and further investigation is
needed.
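
As an illustration, the sketch below applies the test to $f(y) = (y + 1)(y^2 - 9)$ from Example 3.7.2, whose equilibrium points are $y_0 = -3, -1, 3$. The function names are our own, and the derivative $f'(y) = 3y^2 + 2y - 9$ is computed by hand with the product rule.

```python
def f(y):
    return (y + 1) * (y**2 - 9)


def f_prime(y):
    # Product rule: (y^2 - 9) + (y + 1)(2y) = 3y^2 + 2y - 9
    return 3 * y**2 + 2 * y - 9


def classify(y0):
    """Apply the first derivative test at an equilibrium point y0."""
    if f(y0) != 0:
        raise ValueError(f"{y0} is not an equilibrium point")
    slope = f_prime(y0)
    if slope < 0:
        return "asymptotically stable"
    if slope > 0:
        return "unstable"
    return "inconclusive"


for y0 in (-3, -1, 3):
    print(y0, f_prime(y0), classify(y0))
# -3 12 unstable
# -1 -8 asymptotically stable
# 3 24 unstable
```

This matches the sign pattern of $f$: it is negative for $y < -3$, positive on $(-3, -1)$, negative on $(-1, 3)$, and positive for $y > 3$, so on the phase line only the dot at $y_0 = -1$ has both neighboring arrows pointing towards it.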


This suggests a general procedure for plotting a direction field with various solution
curves:

(1) By studying the function $f(y)$, first construct the phase line, including
classifying the equilibrium points as either asymptotically stable or
unstable.

(2) In the direction field, plot the equilibrium solutions.

(3) In the other regions, plot solution curves that behave according to the
phase line: if the phase line points to the left, the solution should be
strictly decreasing and asymptotic to the next lower equilibrium solution
(or diverge to $-\infty$). If the phase line points to the right, the solution
should be strictly increasing and asymptotic to the next higher equilibrium
solution (or diverge to $+\infty$).



CHAPTER 4 


Second-order linear differential equations 


Recall that an explicit second-order differential equation is an equation of the form

$$y'' = f(t, y, y')$$

where $f$ is a three-variable function. A solution to this equation is a function $y(t)$
which is at least twice-differentiable such that for every $t$,

$$y''(t) = f\big(t, y(t), y'(t)\big)$$


In this chapter, we will study a very special type of second-order differential equa- 
tion, the so-called linear second-order differential equations. 


4.1. Overview of second-order linear equations 


In general, second-order differential equations (in the fullest generality) are an order 
of magnitude more complicated than first-order differential equations. For this 
reason, we will restrict our attention to the simplest type of second-order differential 
equation, the second-order linear differential equations. As we shall see, there is 
much we can say about these equations and they have many practical applications. 


Definition 4.1.1. A second-order linear differential equation is a differential
equation which can be put in the form:

$$y'' + p(t)y' + q(t)y = g(t)$$

where the coefficient functions $p, q, g$ are functions of the independent variable
$t$ only. The function $g(t)$ is referred to as the forcing term. If $g(t) = 0$ is the
constant zero function, then the differential equation

$$y'' + p(t)y' + q(t)y = 0$$
is said to be homogeneous.
Here is a representative example: 


Example 4.1.2 (Simple harmonic motion). Consider the homogeneous second-order
linear equation:

$$y'' + \omega^2 y = 0$$
where $\omega \in \mathbb{R}$ is a constant with $\omega \neq 0$. Consider the functions:
$$y_1(t) = \cos\omega t \quad\text{and}\quad y_2(t) = \sin\omega t$$
We claim that these are both solutions (in Section 4.2 we will learn how one finds
these solutions). Indeed, note that:

$$y_1'(t) = -\omega\sin\omega t$$
$$y_2'(t) = \omega\cos\omega t$$
$$y_1''(t) = -\omega^2\cos\omega t$$
$$y_2''(t) = -\omega^2\sin\omega t$$
and thus
$$y_1''(t) + \omega^2 y_1(t) = -\omega^2\cos\omega t + \omega^2\cos\omega t = 0$$
and
$$y_2''(t) + \omega^2 y_2(t) = -\omega^2\sin\omega t + \omega^2\sin\omega t = 0.$$


Are there any other solutions? In this section we will study general properties of 
the set of solutions to a second-order linear equation. We will not learn techniques 
for actually solving second-order equations in this section, but instead just assume 
(for the moment) that we have some method of obtaining solutions. Along these 
lines, the following is relevant: 


Existence and Uniqueness Theorem 4.1.3 (Second-Order Linear). Suppose
$p, q, g : I \to \mathbb{R}$ are continuous functions with domain $I \subseteq \mathbb{R}$ an interval. Then given
$t_0 \in I$ and any two real numbers $y_0, y_1 \in \mathbb{R}$ there is a unique function $y : I \to \mathbb{R}$
which satisfies the initial value problem:

(i) $y'' + p(t)y' + q(t)y = g(t)$
(ii) $y(t_0) = y_0$ and $y'(t_0) = y_1$.


In Example 4.1.2 we saw that we had at least two solutions $y_1(t), y_2(t)$ to the
equation $y'' + \omega^2 y = 0$. For homogeneous linear equations, given two solutions, we
can mass-produce many more solutions.


Definition 4.1.4. Suppose $y_1, y_2 : I \to \mathbb{R}$ are two functions defined on an interval
$I \subseteq \mathbb{R}$. A linear combination of $y_1$ and $y_2$ is any function of the form:

$$C_1 y_1 + C_2 y_2 : I \to \mathbb{R}$$
where $C_1, C_2 \in \mathbb{R}$ are constants.


For example, $3\cos\omega t - 7\sin\omega t$ is a linear combination of $\cos\omega t$ and $\sin\omega t$. The
following proposition says that the collection of all solutions to a homogeneous
second-order linear equation is "closed under linear combinations":


Proposition 4.1.5. Suppose $y_1(t), y_2(t)$ are solutions to the homogeneous second-order
differential equation

$$y'' + p(t)y' + q(t)y = 0.$$

Then for any $C_1, C_2 \in \mathbb{R}$, the function $C_1 y_1 + C_2 y_2$ is also a solution.

PROOF. Let $C_1, C_2 \in \mathbb{R}$ be arbitrary. Note that
$$\begin{aligned}
&(C_1 y_1 + C_2 y_2)'' + p(t)(C_1 y_1 + C_2 y_2)' + q(t)(C_1 y_1 + C_2 y_2) \\
&\quad= (C_1 y_1'' + C_2 y_2'') + p(t)(C_1 y_1' + C_2 y_2') + q(t)(C_1 y_1 + C_2 y_2) \quad\text{(because the derivative is linear)} \\
&\quad= C_1 y_1'' + p(t)C_1 y_1' + q(t)C_1 y_1 + C_2 y_2'' + p(t)C_2 y_2' + q(t)C_2 y_2 \\
&\quad= C_1\big(y_1'' + p(t)y_1' + q(t)y_1\big) + C_2\big(y_2'' + p(t)y_2' + q(t)y_2\big) \\
&\quad= C_1 \cdot 0 + C_2 \cdot 0 = 0
\end{aligned}$$

because $y_1$ and $y_2$ are both solutions. Thus $C_1 y_1 + C_2 y_2$ is also a solution.


When are two solutions “essentially different”? This is captured by the notion of 
linear independence: 


Definition 4.1.6. Suppose $y_1, y_2 : I \to \mathbb{R}$ are functions defined on an interval
$I \subseteq \mathbb{R}$. We say that $y_1$ and $y_2$ are linearly independent if: for every $C_1, C_2 \in \mathbb{R}$,
if
$$C_1 y_1(t) + C_2 y_2(t) = 0 \quad\text{for every } t \in I,$$

then $C_1 = C_2 = 0$. In other words, $y_1$ and $y_2$ are linearly independent if the only
way for a linear combination of $y_1$ and $y_2$ to be the constant zero function is with
the trivial linear combination $0y_1 + 0y_2$. If $y_1$ and $y_2$ are not linearly independent,
then we say they are linearly dependent.

For two functions $y_1$ and $y_2$ to be linearly dependent, this means that either
$y_1$ is a constant multiple of $y_2$ (i.e., $y_1 = Cy_2$ for some $C \in \mathbb{R}$) or $y_2$ is a constant
multiple of $y_1$ ($y_2 = Cy_1$ for some $C \in \mathbb{R}$).


Linear independence is ultimately a linear algebra concept and it is one of the most 
important definitions in undergraduate mathematics. 


Example 4.1.7. Here are some examples of pairs of linearly (in)dependent functions:


(1) The functions $y_1 = \cos t$ and $y_2 = \sin t$ are linearly independent.
JUSTIFICATION. Suppose $C_1, C_2 \in \mathbb{R}$ are arbitrary such that
$$C_1\cos t + C_2\sin t = 0 \quad\text{for every } t \in \mathbb{R}. \tag{$\dagger$}$$

We must show that it must be the case that $C_1 = C_2 = 0$. Since ($\dagger$) holds
for all $t \in \mathbb{R}$, it holds for $t_0 := 0$. Plugging in this $t$-value tells us:

$$0 = C_1\cos 0 + C_2\sin 0 = C_1 \cdot 1 + C_2 \cdot 0 = C_1$$

and so $C_1 = 0$. Likewise, ($\dagger$) must also hold for $t_1 := \pi/2$. Plugging in
this $t$-value tells us:

$$0 = C_1\cos(\pi/2) + C_2\sin(\pi/2) = C_1 \cdot 0 + C_2 \cdot 1 = C_2,$$

and so $C_2 = 0$ as well. Since $C_1 = C_2 = 0$, we conclude that $\cos t$ and
$\sin t$ are linearly independent.


(2) The functions $e^t$ and $2e^t$ are not linearly independent (i.e., they are
linearly dependent).
JUSTIFICATION. Note that for $C_1 := 2$ and $C_2 := -1$ we have

$$C_1 e^t + C_2 \cdot 2e^t = 2e^t - 2e^t = 0 \quad\text{for every } t \in \mathbb{R},$$

however $C_1$ and $C_2$ are not both zero.


(3) Suppose $f : \mathbb{R} \to \mathbb{R}$ is any function and $g : \mathbb{R} \to \mathbb{R}$ is the constant
zero function ($g(t) := 0$ for all $t \in \mathbb{R}$). Then $f(t)$ and $g(t)$ are linearly
dependent.


JUSTIFICATION. Note that for $C_1 := 0$ and $C_2 := 1$ we have
$$C_1 f(t) + C_2 g(t) = 0 \cdot f(t) + 1 \cdot 0 = 0 \quad\text{for every } t \in \mathbb{R},$$

although $C_1$ and $C_2$ are not both zero (only $C_1$ is zero, but we need both
of them to be zero in order to conclude linear independence).


As Example 4.1.7 illustrates, it can sometimes be a little tedious to show directly
that two functions are linearly independent. Miraculously, for differentiable
functions there is a much more systematic way to determine the linear
dependence/independence of a pair of functions. This involves computing the so-called
Wronskian:


Definition 4.1.8. Suppose $u, v : I \to \mathbb{R}$ are two differentiable functions defined on
an interval $I \subseteq \mathbb{R}$. Define the Wronskian of $u$ and $v$ to be the function $W : I \to \mathbb{R}$
defined by

$$W(t) := \det\begin{bmatrix} u(t) & v(t) \\ u'(t) & v'(t) \end{bmatrix} = u(t)v'(t) - v(t)u'(t)$$

for all $t \in I$.


You might think that the Wronskian W(t) could in general be any function, but in 
fact it satisfies the following surprising dichotomy: 


Proposition 4.1.9 (Wronskian dichotomy I). Suppose $p, q, u, v : I \to \mathbb{R}$ are functions
defined on an interval $I \subseteq \mathbb{R}$ such that $u$ and $v$ are solutions to

$$y'' + p(t)y' + q(t)y = 0$$

Let $W(t)$ be the Wronskian of $u$ and $v$. Then exactly one of the following two things
is true:


(Case 1) $W(t) = 0$ for all $t \in I$, or
(Case 2) $W(t) \neq 0$ for all $t \in I$.


PROOF. We are assuming that both $u$ and $v$ satisfy:

$$u'' + pu' + qu = 0 \quad\text{and}\quad v'' + pv' + qv = 0.$$
We wish to show that $W = uv' - vu'$ is either everywhere zero, or everywhere
nonzero. First, differentiate $W$:
$$\begin{aligned}
W' &= uv'' + u'v' - vu'' - v'u' \\
   &= uv'' - vu'' \\
   &= u(-pv' - qv) - v(-pu' - qu) \quad\text{(because $u, v$ are solutions)} \\
   &= -puv' - quv + pvu' + qvu \\
   &= -p(uv' - vu') \\
   &= -pW.
\end{aligned}$$
Thus, the function $W(t)$ is a solution to the first-order linear homogeneous equation
$W' + pW = 0$. Pick $t_0$ in the domain of $W$, and suppose $W(t_0) = W_0$. Then by
Theorem 3.3.6 we have that

$$W(t) = W_0\exp\left(-\int_{t_0}^{t} p(s)\,ds\right)$$

Thus, if $W_0 = 0$, we are in Case 1. Otherwise, if $W_0 \neq 0$, we are in Case 2, since
the exponential function is never zero.


Proposition 4.1.9 essentially says that $W(t)$ must be always zero or never zero.
It can't be sometimes zero and sometimes nonzero. The dichotomy in
Proposition 4.1.9 gives rise to the linear dependence/independence dichotomy:


Proposition 4.1.10 (Wronskian dichotomy II). Suppose $p, q, u, v : I \to \mathbb{R}$ are
functions defined on an interval $I \subseteq \mathbb{R}$ such that $u$ and $v$ are solutions to

$$y'' + p(t)y' + q(t)y = 0$$
Let $W(t)$ be the Wronskian of $u$ and $v$. Then:
(Case 1) if there is some $t_0 \in I$ such that $W(t_0) = 0$ (which implies $W(t) = 0$ for
all $t \in I$), then $u$ and $v$ are linearly dependent, and
(Case 2) if there is some $t_0 \in I$ such that $W(t_0) \neq 0$ (which implies $W(t) \neq 0$ for
all $t \in I$), then $u$ and $v$ are linearly independent.


PROOF. Case 1: Assume first that we are in Case 1, i.e., there is some $t_0 \in I$ such
that $W(t_0) = 0$. Then by Proposition 4.1.9 we know that $W(t) = 0$ for all $t \in I$.
We have two subcases.

Case 1(a): Assume that $v(t) = 0$ for every $t \in I$. Then $1 \cdot v(t) + 0 \cdot u(t) = 0$
for every $t \in I$ and so $u$ and $v$ are linearly dependent.

Case 1(b): Assume there is $t_0 \in I$ such that $v(t_0) \neq 0$. By the Bump
Lemma 2.2.7 there is $\alpha < t_0 < \beta$ such that $v(t) \neq 0$ for every $t \in (\alpha, \beta) \cap I$. On
this interval $(\alpha, \beta) \cap I$, we have

$$\frac{d}{dt}\frac{u}{v} = \frac{u'v - uv'}{v^2} = \frac{-W}{v^2} = 0.$$
Thus by Corollary 2.3.7 there is a constant $C \in \mathbb{R}$ such that $u(t)/v(t) = C$ for every
$t \in (\alpha, \beta) \cap I$, i.e., $u(t) = Cv(t)$ for every $t \in (\alpha, \beta) \cap I$. In particular, both $u(t)$
and $Cv(t)$ are solutions to the IVP:
(1) $y'' + py' + qy = 0$
(2) $y(t_0) = u(t_0)$, $y'(t_0) = u'(t_0)$.
By the Existence and Uniqueness Theorem 4.1.3, we conclude that $u(t) = Cv(t)$
for every $t \in I$. Thus $u(t)$ and $v(t)$ are linearly dependent.

Case 2: Suppose there is $t_0 \in I$ such that $W(t_0) \neq 0$. By Proposition 4.1.9 we
know that $W(t) \neq 0$ for all $t \in I$. Assume towards a contradiction that $u(t), v(t)$ are
linearly dependent. Thus there exist constants $C_1, C_2 \in \mathbb{R}$ such that $(C_1, C_2) \neq (0, 0)$
and $C_1 u(t) + C_2 v(t) = 0$ for every $t \in I$. This gives us two cases:

Case 2(a): Suppose $C_1 \neq 0$. Then for $C := -C_2/C_1$ we have $u(t) = Cv(t)$ for
every $t \in I$. Thus the Wronskian is:

$$W(t) = uv' - vu' = Cvv' - v(Cv)' = 0,$$

a contradiction.
Case 2(b): Suppose $C_2 \neq 0$. This case is similar.


Example 4.1.11. We return to the first two examples from Example 4.1.7.
(1) Consider the equation:

$$y'' + y = 0$$
This has solutions $y_1 = \cos t$ and $y_2 = \sin t$. Next we compute the Wronskian:
$$W(t) = \cos t\,(\sin t)' - \sin t\,(\cos t)' = \cos^2 t + \sin^2 t = 1.$$
We see that this is everywhere $\neq 0$. Thus by Proposition 4.1.10 we conclude
that $y_1, y_2$ are linearly independent.
(2) Consider the equation:

$$y'' - 2y' + y = 0$$

We see that $y_1 = e^t$ and $y_2 = 2e^t$ are both solutions. Next we compute
the Wronskian:

$$W(t) = e^t(2e^t)' - 2e^t(e^t)' = 2e^{2t} - 2e^{2t} = 0$$

Since $W(t) = 0$ for all $t$, we conclude by Proposition 4.1.10 that $y_1, y_2$ are
linearly dependent.
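
The two Wronskians in Example 4.1.11 are easy to evaluate numerically. The sketch below is our own illustration: the helper `wronskian` takes each function together with its hand-computed derivative and evaluates $W(t) = u(t)v'(t) - v(t)u'(t)$ at a few arbitrarily chosen sample points.

```python
import math


def wronskian(u, du, v, dv, t):
    """W(t) = u(t) v'(t) - v(t) u'(t), with the derivatives supplied explicitly."""
    return u(t) * dv(t) - v(t) * du(t)


ts = [-2.0, 0.0, 0.5, 3.0]

# Pair (1): u = cos t, v = sin t.
w_trig = [
    wronskian(math.cos, lambda t: -math.sin(t), math.sin, math.cos, t)
    for t in ts
]

# Pair (2): u = e^t, v = 2e^t.
w_exp = [
    wronskian(math.exp, math.exp,
              lambda t: 2 * math.exp(t), lambda t: 2 * math.exp(t), t)
    for t in ts
]

print(w_trig)  # each entry equals 1 up to rounding
print(w_exp)   # each entry is 0
```

By Proposition 4.1.9, for two solutions of the same homogeneous equation it is enough to evaluate $W$ at a single point; the extra sample points here only illustrate the dichotomy.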


We now arrive at the main result of this section: 


Theorem 4.1.12. Suppose $y_1, y_2$ are linearly independent solutions to the homogeneous
second-order linear equation

$$y'' + p(t)y' + q(t)y = 0$$
Then the general solution is:
$$y(t; C_1, C_2) = C_1 y_1(t) + C_2 y_2(t).$$


PROOF. Suppose $y : I \to \mathbb{R}$ is an arbitrary solution to $y'' + py' + qy = 0$. We must
show there exist constants $C_1, C_2 \in \mathbb{R}$ such that $y = C_1 y_1 + C_2 y_2$. Let $t_0 \in I$. We
first must find constants $C_1, C_2 \in \mathbb{R}$ which satisfy:

$$C_1 y_1(t_0) + C_2 y_2(t_0) = y(t_0)$$
$$C_1 y_1'(t_0) + C_2 y_2'(t_0) = y'(t_0)$$

This is possible because $y_1, y_2$ are assumed to be linearly independent, and thus
(by Proposition 4.1.10)

$$W(t_0) = \det\begin{bmatrix} y_1(t_0) & y_2(t_0) \\ y_1'(t_0) & y_2'(t_0) \end{bmatrix} \neq 0.$$
This implies that the above system has a unique solution. The function $C_1 y_1 + C_2 y_2$
is also a solution to $y'' + py' + qy = 0$ by Proposition 4.1.5. Furthermore, both
$y : I \to \mathbb{R}$ and $C_1 y_1 + C_2 y_2 : I \to \mathbb{R}$ are solutions to the IVP:

(i) $y'' + py' + qy = 0$
(ii) $y(t_0) = y(t_0)$, $y'(t_0) = y'(t_0)$,
and so by the Existence and Uniqueness Theorem 4.1.3 it follows that $y = C_1 y_1 + C_2 y_2$
(i.e., these functions $I \to \mathbb{R}$ are equal). This finishes the proof.


Since a pair of linearly independent solutions to a homogeneous second-order linear 
equation is capable of producing all other solutions, we call such a pair a funda- 
mental set of solutions: 


Definition 4.1.13. A fundamental set of solutions to the homogeneous second-order
equation
$$y'' + p(t)y' + q(t)y = 0$$
is a pair $y_1, y_2$ of linearly independent solutions. Ultimately, "fundamental set of
solutions" refers to the fact that the pair $y_1, y_2$ satisfies the following two properties:
(1) $y_1$ and $y_2$ "generate" all other solutions in the sense that the general
solution is $y(t; C_1, C_2) = C_1 y_1 + C_2 y_2$, and
(2) there is no "redundancy" among $y_1$ and $y_2$ (since they are linearly
independent), i.e., both solutions are needed to generate all other solutions.
[In linear algebra terms, a "fundamental set of solutions" is a basis of the subspace
of all solutions.]


Example 4.1.14 (Simple harmonic motion). Find the particular solution to the
following initial value problem:

(1) $y'' + \omega^2 y = 0$ ($\omega \in \mathbb{R}$ a constant, $\omega \neq 0$)
(2) $y(0) = 1$, $y'(0) = 2$.
We already know that $y_1 = \cos\omega t$ and $y_2 = \sin\omega t$ form a fundamental set of
solutions (since $W(t) = \omega \neq 0$). By Theorem 4.1.12 the general solution is:

$$y(t; C_1, C_2) = C_1\cos\omega t + C_2\sin\omega t$$

for some $C_1, C_2 \in \mathbb{R}$. We will use the initial conditions to solve for $C_1, C_2$. First
note that since $y(0) = 1$, we have

$$1 = y(0) = C_1\cos 0 + C_2\sin 0 = C_1.$$

Second, taking a derivative of the general solution yields:

$$y'(t) = -C_1\omega\sin\omega t + C_2\omega\cos\omega t$$
and the condition $y'(0) = 2$ gives
$$2 = y'(0) = -C_1\omega\sin 0 + C_2\omega\cos 0 = C_2\omega$$
and so $C_2 = 2/\omega$. Thus the particular solution is

$$y(t) = \cos\omega t + \frac{2}{\omega}\sin\omega t.$$
Note: really we obtained a system of equations:
$$1 \cdot C_1 + 0 \cdot C_2 = 1$$
$$0 \cdot C_1 + \omega \cdot C_2 = 2$$
This particular system is immediate to solve, but in general it might be more
complicated and require Gaussian Elimination (or whatever your favorite method
of solving a $2 \times 2$ system is).
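
For a $2 \times 2$ system, Cramer's rule is one convenient alternative to Gaussian Elimination. The sketch below is only an illustration: the helper `solve_2x2` is our own, and the value $\omega = 3$ is an arbitrary choice made so that the numbers are concrete. Note that the determinant of the coefficient matrix is exactly the Wronskian $W(t_0)$, which is why a fundamental set of solutions always yields a uniquely solvable system.

```python
import math


def solve_2x2(a, b, c, d, r1, r2):
    """Solve a*C1 + b*C2 = r1 and c*C1 + d*C2 = r2 by Cramer's rule."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular system: the determinant is zero")
    return (r1 * d - b * r2) / det, (a * r2 - c * r1) / det


omega = 3.0                   # illustrative value, not from the notes
t0, y0, v0 = 0.0, 1.0, 2.0    # initial conditions y(0) = 1, y'(0) = 2

# Rows of the system: [y1(t0), y2(t0)] and [y1'(t0), y2'(t0)],
# where y1 = cos(omega t) and y2 = sin(omega t).
C1, C2 = solve_2x2(
    math.cos(omega * t0), math.sin(omega * t0),
    -omega * math.sin(omega * t0), omega * math.cos(omega * t0),
    y0, v0,
)
print(C1, C2)  # C1 = 1 and C2 = 2/omega
```

The same helper works for initial conditions at any $t_0$, where the coefficient matrix is no longer diagonal.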


We end by answering a question which is implicit in the above discussion: 


Question 4.1.15. Given continuous $p, q : I \to \mathbb{R}$ defined on an interval $I \subseteq \mathbb{R}$, do there
always exist two linearly independent solutions $y_1, y_2 : I \to \mathbb{R}$ of the homogeneous
second-order linear differential equation:

$$y'' + p(t)y' + q(t)y = 0?$$
ANSWER. Yes! By the Existence and Uniqueness Theorem 4.1.3, we can arbitrarily
choose $t_0 \in I$ and then obtain two solutions $y_1, y_2 : I \to \mathbb{R}$ which satisfy the initial
conditions:
(1) $y_1(t_0) = 1$, $y_1'(t_0) = 0$
(2) $y_2(t_0) = 0$, $y_2'(t_0) = 1$.
We claim that $y_1, y_2$ are linearly independent. Indeed, note that:
$$W(t_0) = \det\begin{bmatrix} y_1(t_0) & y_2(t_0) \\ y_1'(t_0) & y_2'(t_0) \end{bmatrix} = \det\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = 1.$$
Thus the Wronskian of $y_1, y_2$ is nonzero at least at the value $t_0$. By
Proposition 4.1.10 it follows that $W(t) \neq 0$ for every $t \in I$ and also that $y_1, y_2$ are linearly
independent.


4.2. Homogeneous second-order linear equations with constant 
coefficients 


In this section we study a very special case of homogeneous second-order linear
equations, those with constant coefficients:

$$y'' + py' + qy = 0$$
where $p, q \in \mathbb{R}$ are constant functions. The simple harmonic motion equation
(Example 4.1.2) is already an example of such an equation. To study these equations,
we need to introduce an auxiliary device, the so-called characteristic polynomial:


Definition 4.2.1. The characteristic polynomial associated to the homogeneous
second-order linear equation

$$y'' + py' + qy = 0$$
(where $p, q \in \mathbb{R}$ are constant functions) is the quadratic polynomial

$$f(\lambda) = \lambda^2 + p\lambda + q$$
in the variable $\lambda$. A root of the characteristic polynomial is called a characteristic
root.


Recall that the quadratic formula gives us the roots of a quadratic equation:

$$\lambda_1, \lambda_2 = \frac{-p \pm \sqrt{p^2 - 4q}}{2}$$
Furthermore, the nature of the two roots $\lambda_1, \lambda_2$ falls into three cases, depending on
the value of the discriminant $p^2 - 4q$:
(1) If $p^2 - 4q > 0$, then $\lambda_1 \neq \lambda_2$ are distinct and both real numbers.
(2) If $p^2 - 4q = 0$, then $\lambda_1 = \lambda_2$ are the same real number.
(3) If $p^2 - 4q < 0$, then $\lambda_1 \neq \lambda_2$ are distinct but they are not real numbers
(they are complex numbers).


We shall study these three cases separately. 


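Distinct real roots. In this subsection, we fix a homogeneous second-order
linear differential equation with constant coefficients:

$$y'' + py' + qy = 0$$
and we let
$$f(\lambda) = \lambda^2 + p\lambda + q$$

be its characteristic polynomial. Furthermore, we assume that $f$ has two distinct
real roots $\lambda_1$ and $\lambda_2$.


Theorem 4.2.2 (Distinct real roots). The general solution to
$$y'' + py' + qy = 0$$
when $\lambda_1 \neq \lambda_2$ are distinct and real is:
$$y(t; C_1, C_2) = C_1 e^{\lambda_1 t} + C_2 e^{\lambda_2 t}.$$
PROOF. We first claim that $e^{\lambda_i t}$ is a solution, for $i = 1, 2$. Note that
$$\begin{aligned}
(e^{\lambda_i t})'' + p(e^{\lambda_i t})' + qe^{\lambda_i t} &= \lambda_i^2 e^{\lambda_i t} + p\lambda_i e^{\lambda_i t} + qe^{\lambda_i t} \\
&= (\lambda_i^2 + p\lambda_i + q)e^{\lambda_i t} \\
&= f(\lambda_i)e^{\lambda_i t} \\
&= 0 \cdot e^{\lambda_i t} \quad\text{(because $\lambda_i$ is a root of $f(\lambda)$)} \\
&= 0.
\end{aligned}$$

Thus both $e^{\lambda_1 t}$ and $e^{\lambda_2 t}$ are solutions. Next, we claim they are linearly independent.
Indeed, note that:

$$\begin{aligned}
W(t) &= \det\begin{bmatrix} e^{\lambda_1 t} & e^{\lambda_2 t} \\ \lambda_1 e^{\lambda_1 t} & \lambda_2 e^{\lambda_2 t} \end{bmatrix} \\
&= \lambda_2 e^{\lambda_1 t}e^{\lambda_2 t} - \lambda_1 e^{\lambda_1 t}e^{\lambda_2 t} \\
&= (\lambda_2 - \lambda_1)e^{(\lambda_1 + \lambda_2)t} \neq 0
\end{aligned}$$

since $\lambda_2 - \lambda_1 \neq 0$ and $e^{(\lambda_1 + \lambda_2)t}$ is never zero. Thus by Theorem 4.1.12 we conclude
that $e^{\lambda_1 t}, e^{\lambda_2 t}$ is a fundamental set of solutions and that

$$y(t; C_1, C_2) = C_1 e^{\lambda_1 t} + C_2 e^{\lambda_2 t}$$

is the general solution.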


Example 4.2.3. We will solve the IVP:
(i) $y'' - 3y' + 2y = 0$
(ii) $y(0) = 2$, $y'(0) = 1$.
First we compute the zeros of the characteristic polynomial:
$$f(\lambda) = \lambda^2 - 3\lambda + 2$$

By the quadratic formula, the two zeros are
$$\lambda_1, \lambda_2 = \frac{3 \pm \sqrt{9 - 8}}{2} = 2, 1.$$

Thus by Theorem 4.2.2 the general solution is
$$y(t; C_1, C_2) = C_1 e^{2t} + C_2 e^{t}.$$
Now we need to use the initial conditions to find the values of $C_1, C_2$. First note
that
$$2 = y(0) = C_1 e^{2 \cdot 0} + C_2 e^{0} = C_1 + C_2$$
and since $y'(t) = 2C_1 e^{2t} + C_2 e^{t}$, we also have
$$1 = y'(0) = 2C_1 e^{2 \cdot 0} + C_2 e^{0} = 2C_1 + C_2.$$
Thus we have a system of equations:
$$\begin{aligned}
C_1 + C_2 &= 2 \\
2C_1 + C_2 &= 1
\end{aligned}$$

There are many ways to solve this; one way is Gaussian Elimination:

$$\left[\begin{array}{cc|c} 1 & 1 & 2 \\ 2 & 1 & 1 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{cc|c} 1 & 0 & -1 \\ 0 & 1 & 3 \end{array}\right]$$
Thus $C_1 = -1$ and $C_2 = 3$. We conclude that the particular solution to the IVP is:
$$y(t) = -e^{2t} + 3e^{t}.$$
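
The step from initial conditions to constants can be cross-checked in code. The following is a minimal sketch, assuming the distinct-real-roots form $C_1 e^{\lambda_1 t} + C_2 e^{\lambda_2 t}$ of Theorem 4.2.2; the helper names `solve_distinct_real` and `residual` are hypothetical and chosen only for this illustration.

```python
import math


def solve_distinct_real(lam1, lam2, y0, v0):
    """Return (C1, C2) so that y(t) = C1*exp(lam1*t) + C2*exp(lam2*t)
    satisfies y(0) = y0 and y'(0) = v0 (requires lam1 != lam2)."""
    # The initial conditions give: C1 + C2 = y0 and lam1*C1 + lam2*C2 = v0.
    c1 = (v0 - lam2 * y0) / (lam1 - lam2)
    c2 = y0 - c1
    return c1, c2


def residual(c1, c2, lam1, lam2, p, q, t):
    """Evaluate y'' + p*y' + q*y at time t using the exact derivatives."""
    e1, e2 = math.exp(lam1 * t), math.exp(lam2 * t)
    y = c1 * e1 + c2 * e2
    dy = c1 * lam1 * e1 + c2 * lam2 * e2
    d2y = c1 * lam1**2 * e1 + c2 * lam2**2 * e2
    return d2y + p * dy + q * y


# The IVP y'' - 3y' + 2y = 0, y(0) = 2, y'(0) = 1 has roots 2 and 1.
c1, c2 = solve_distinct_real(2, 1, y0=2, v0=1)
```

The residual should vanish (up to rounding) at every $t$, which confirms that the computed function really solves the differential equation and not only the initial conditions.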


Repeated real roots. In this subsection, we study the situation where 
y" +py'+qy = 0 


has repeated characteristic roots $\lambda_1 = \lambda_2$. Note that the proof of Theorem 4.2.2
above already shows that $e^{\lambda_1 t}$ is a solution. This raises the following question:


Question 4.2.4. How do we find a second linearly independent solution to y" + 
py’ +qy = 0? 


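ANSWER. Let $y_1(t) = e^{\lambda_1 t}$ be the first known solution. Since $\lambda_1$ is a double root
of $f(\lambda) = \lambda^2 + p\lambda + q$, it follows that $f(\lambda) = (\lambda - \lambda_1)^2 = \lambda^2 - 2\lambda_1\lambda + \lambda_1^2$. Thus
$p = -2\lambda_1$, i.e., $\lambda_1 = -p/2$, and $q = \lambda_1^2 = p^2/4$.

We shall guess that a second solution is of the form $y_2(t) = v(t)e^{\lambda_1 t}$, where $v(t)$
is an unknown function. We shall determine $v(t)$. First we compute:

$$\begin{aligned}
y_2' &= e^{\lambda_1 t}(v' + \lambda_1 v) \\
y_2'' &= e^{\lambda_1 t}(v'' + 2\lambda_1 v' + \lambda_1^2 v)
\end{aligned}$$
Thus, in order for $y_2$ to be a solution, we need the following to be equal to zero:
$$\begin{aligned}
y_2'' + p y_2' + q y_2 &= e^{\lambda_1 t}(v'' + 2\lambda_1 v' + \lambda_1^2 v) + p e^{\lambda_1 t}(v' + \lambda_1 v) + q v e^{\lambda_1 t} \\
&= e^{\lambda_1 t}\big(v'' + 2\lambda_1 v' + \lambda_1^2 v + p(v' + \lambda_1 v) + q v\big) \\
&= e^{\lambda_1 t}\big(v'' - p v' + p^2 v/4 + p(v' - p v/2) + p^2 v/4\big) \\
&= e^{\lambda_1 t} v''.
\end{aligned}$$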




Since $e^{\lambda_1 t} \neq 0$ for all $t$, we require that $v'' = 0$. In this case, we get that $v = At + B$
is linear, and so we can take a second solution of the form $y_2(t) = (At + B)e^{\lambda_1 t}$.
Since we require that $y_2(t)$ is linearly independent from $y_1(t)$, it suffices to take
$y_2(t) = te^{\lambda_1 t}$.
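
As a sanity check on this derivation, the following sketch evaluates the left-hand side of the equation for the candidate second solution $t e^{\lambda_1 t}$, using $p = -2\lambda_1$ and $q = \lambda_1^2$, and also computes the Wronskian of the two solutions. The function names are hypothetical; the derivative formulas in the comments come from the product rule.

```python
import math


def repeated_root_residual(lam, t):
    """Residual of y'' + p*y' + q*y for y(t) = t*exp(lam*t),
    where p = -2*lam and q = lam**2 (so lam is a double root)."""
    p, q = -2 * lam, lam**2
    e = math.exp(lam * t)
    y = t * e
    dy = (1 + lam * t) * e             # product rule
    d2y = (2 * lam + lam**2 * t) * e   # product rule again
    return d2y + p * dy + q * y


def wronskian(lam, t):
    """Wronskian of y1 = exp(lam*t) and y2 = t*exp(lam*t)."""
    e = math.exp(lam * t)
    y1, dy1 = e, lam * e
    y2, dy2 = t * e, (1 + lam * t) * e
    return y1 * dy2 - y2 * dy1         # simplifies to exp(2*lam*t)
```

The residual is zero up to rounding, and the Wronskian $e^{2\lambda_1 t}$ is never zero, which is consistent with the two solutions being linearly independent.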


We summarize the above as follows: 

Theorem 4.2.5 (Repeated real roots). The general solution to 
y" +py'+qy = 0 

when $\lambda_1 = \lambda_2$ are not distinct (and real) is:

$$y(t; C_1, C_2) = C_1 e^{\lambda_1 t} + C_2 t e^{\lambda_1 t}.$$


PROOF. This is a worksheet exercise. 


Example 4.2.6. We will solve the following IVP:
(1) $y'' - 2y' + y = 0$
(2) $y(0) = 2$, $y'(0) = -1$
First we consider the characteristic polynomial:
$$f(\lambda) = \lambda^2 - 2\lambda + 1 = (\lambda - 1)^2.$$
We see that $\lambda_1 = \lambda_2 = 1$ is a repeated root. Thus by Theorem 4.2.5 the general
solution is
$$y(t; C_1, C_2) = C_1 e^{t} + C_2 t e^{t}.$$
Now we need to use our initial conditions to solve for $C_1, C_2$. First note that
$$2 = y(0) = C_1 e^{0} + C_2 \cdot 0 \cdot e^{0} = C_1.$$
Next we differentiate our general solution:
$$y'(t) = (C_1 + C_2) e^{t} + C_2 t e^{t}$$
to get
$$-1 = y'(0) = C_1 + C_2.$$
This yields the system
$$\begin{aligned}
C_1 &= 2 \\
C_1 + C_2 &= -1
\end{aligned}$$
We see that $C_1 = 2$, $C_2 = -3$. Thus our particular solution is
$$y(t) = 2e^{t} - 3te^{t}.$$


Complex (non-real) roots. We finally consider the case where $\lambda_1 \neq \lambda_2 \in \mathbb{C}$,
i.e., the case when the discriminant $p^2 - 4q < 0$ of the characteristic polynomial is
negative, which yields two distinct complex (non-real) roots. First we briefly recall
some facts about the complex numbers:

(1) A complex number is a number of the form $z = a + bi$, where $a, b \in \mathbb{R}$
and $i^2 = -1$ is the imaginary unit. We denote the set of all complex
numbers by $\mathbb{C}$.

(2) Given a complex number $z = a + bi$, we define its real part to be $\mathrm{Re}(z) := a$
and its imaginary part to be $\mathrm{Im}(z) := b$.

(3) Given a complex number $z = a + bi$, we define its complex conjugate
to be $\bar{z} := a - bi$.
(4) Here are some facts about the complex conjugate of a complex number
$z = a + bi$:


( 
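(b) $\mathrm{Re}(z) = (z + \bar{z})/2$
(c) $\mathrm{Im}(z) = (z - \bar{z})/2i$
(d) $z = \bar{z}$ iff $z \in \mathbb{R}$ iff $b = 0$.
(e) for $w \in \mathbb{C}$ we have $\overline{z + w} = \bar{z} + \bar{w}$ and $\overline{zw} = \bar{z} \cdot \bar{w}$
(5) The complex exponential function behaves according to Euler's formula:

$$e^{a + bi} = e^{a}(\cos b + i \sin b)$$

(6) Suppose $f(\lambda) = \lambda^2 + p\lambda + q$ is a polynomial with real coefficients $p, q \in \mathbb{R}$
and a complex (non-real) root $\lambda_1 = a + bi$. Then $\lambda_2 := \bar{\lambda}_1 = a - bi$ is
also a complex root, i.e., the complex roots of a real polynomial occur in
complex conjugate pairs.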

(7) Suppose z(t) is a complex-valued function such that z(t) = x(t) + y(t)i, 
where x(t), y(t) are real valued functions. Then 


$$\frac{d}{dt} z(t) = \frac{d}{dt} x(t) + i \frac{d}{dt} y(t),$$


i.e., complex-valued functions can be differentiated by separately differen- 
tiating the real and imaginary parts in the usual way. 


If we allow ourselves to consider complex-valued functions as a solution to our 
differential equation, then we obtain the following analogue of the distinct
real-roots case (Theorem 4.2.2):


Theorem 4.2.7 (Distinct complex roots, complex version). The general solution
to

$$y'' + py' + qy = 0$$
when $\lambda_1 = a + bi$, $\lambda_2 = a - bi$ are distinct and complex is:

$$y(t; C_1, C_2) = C_1 e^{\lambda_1 t} + C_2 e^{\lambda_2 t} = C_1 e^{(a + bi)t} + C_2 e^{(a - bi)t}.$$

PROOF. The proof is the same as the proof of Theorem 4.2.2.


Of course, ultimately we are interested in real-valued functions as solutions. For 
this, the following is useful: 
Observation 4.2.8. Suppose z(t) is a complex-valued function which is a solution 
to 
$$y'' + py' + qy = 0,$$
where $p, q \in \mathbb{R}$. Then:

(1) the function $\bar{z}(t)$ is also a solution.

Furthermore, since the set of all solutions is closed under linear combinations, it
follows that:

(2) $\mathrm{Re}(z(t))$ is a real-valued solution, and
(3) $\mathrm{Im}(z(t))$ is a real-valued solution.


This observation and Euler's formula yield the following:
Theorem 4.2.9 (Distinct complex roots, real version). The general solution to
$$y'' + py' + qy = 0$$
when $\lambda_1 = a + bi$, $\lambda_2 = a - bi$ are distinct and complex is:

$$y(t; C_1, C_2) = C_1 e^{at} \cos bt + C_2 e^{at} \sin bt.$$


PROOF. This is a worksheet exercise. 


Example 4.2.10. We will solve the following IVP:
(1) $y'' + 2y' + 2y = 0$
(2) $y(0) = 2$, $y'(0) = 3$.
First we consider the characteristic polynomial:
$$f(\lambda) = \lambda^2 + 2\lambda + 2$$
By the quadratic formula, we see that the characteristic roots are:

$$\lambda_1, \lambda_2 = \frac{-2 \pm \sqrt{4 - 8}}{2} = -1 \pm i$$

Thus $\lambda_1 = a + bi = -1 + i$, where $a = -1$ and $b = 1$. By Theorem 4.2.9 the general
solution is

$$y(t; C_1, C_2) = C_1 e^{-t} \cos t + C_2 e^{-t} \sin t$$
Next we use our initial conditions to solve for $C_1, C_2$. Note that
$$2 = y(0) = C_1$$

Then we differentiate the general solution:

$$y'(t) = -e^{-t}(C_1 \cos t + C_2 \sin t) + e^{-t}(-C_1 \sin t + C_2 \cos t)$$
to get
$$3 = y'(0) = -C_1 + C_2.$$
This yields the system
$$\begin{aligned}
C_1 &= 2 \\
-C_1 + C_2 &= 3
\end{aligned}$$

and so $C_1 = 2$, $C_2 = 5$. We conclude that our particular solution is

$$y(t) = 2e^{-t} \cos t + 5e^{-t} \sin t.$$
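
The relationship between the complex and real versions of the general solution can be illustrated with Python's built-in complex numbers. This is a sketch under the assumption that the roots are $a \pm bi$ with $b \neq 0$; the helper names are hypothetical.

```python
import cmath
import math


def solve_complex(a, b, y0, v0):
    """Return (C1, C2) for y(t) = exp(a*t)*(C1*cos(b*t) + C2*sin(b*t))
    with y(0) = y0 and y'(0) = v0 (requires b != 0)."""
    # Evaluating at t = 0 gives y(0) = C1 and y'(0) = a*C1 + b*C2.
    c1 = y0
    c2 = (v0 - a * y0) / b
    return c1, c2


def real_and_imag_parts(a, b, t):
    """Real and imaginary parts of the complex solution exp((a + bi)*t)."""
    z = cmath.exp(complex(a, b) * t)
    return z.real, z.imag
```

By Euler's formula the two parts returned by `real_and_imag_parts` agree with $e^{at}\cos bt$ and $e^{at}\sin bt$, which are exactly the two real solutions used in Theorem 4.2.9.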


4.3. The method of undetermined coefficients 


In this section we discuss a method for solving inhomogeneous second-order linear 
equations. The method is called the method of undetermined coefficients, and is
sometimes also called the method of (judicious) guessing. This method does not
always work, but it works for a large enough class of differential equations that it is
worth discussing. The first order of business is to discuss inhomogeneous equations
in general.

Inhomogeneous equations. Recall that we are ultimately interested in second-order
linear differential equations of the form

$$y'' + p(t)y' + q(t)y = g(t)$$

When the forcing term $g(t) = 0$ for all $t$, the differential equation is homogeneous;
otherwise, it is inhomogeneous. We have already studied the structure of
the general solution to a homogeneous equation in Section 4.1 and we have seen
how to solve homogeneous equations with constant coefficients in Section 4.2. The
following theorem tells us how to form the general solution of an inhomogeneous
equation provided that we know the general solution of the corresponding homogeneous
equation and we are somehow able to obtain at least one particular solution
to the inhomogeneous equation:


Theorem 4.3.1 (General solution to inhomogeneous equation). Suppose $y_p(t)$ is
a particular solution to the inhomogeneous equation

(A) $y'' + p(t)y' + q(t)y = g(t)$

and that $y_1(t), y_2(t)$ form a fundamental set of solutions to the corresponding
homogeneous equation

(B) $y'' + p(t)y' + q(t)y = 0$.
Then the general solution to the inhomogeneous equation (A) is

$$y(t) = y(t; C_1, C_2) = C_1 y_1(t) + C_2 y_2(t) + y_p(t).$$


PROOF. We need to show two things. First we will show that for any choice of
$C_1, C_2 \in \mathbb{R}$, $y(t)$ is indeed a solution to (A). Note that:

$$\begin{aligned}
y'' + p(t)y' + q(t)y &= (C_1 y_1 + C_2 y_2 + y_p)'' + p(t)(C_1 y_1 + C_2 y_2 + y_p)' + q(t)(C_1 y_1 + C_2 y_2 + y_p) \\
&= (C_1 y_1 + C_2 y_2)'' + p(t)(C_1 y_1 + C_2 y_2)' + q(t)(C_1 y_1 + C_2 y_2) + y_p'' + p(t)y_p' + q(t)y_p \\
&\qquad \text{because the derivative is linear} \\
&= 0 + y_p'' + p(t)y_p' + q(t)y_p \quad \text{because $C_1 y_1 + C_2 y_2$ is a solution to (B)} \\
&= g(t) \quad \text{because $y_p$ is a solution to (A).}
\end{aligned}$$
Thus $y(t) = C_1 y_1 + C_2 y_2 + y_p$ is a solution to (A).

Next we will show that an arbitrary solution $y(t)$ of (A) must be of the form
$y(t) = C_1 y_1 + C_2 y_2 + y_p$ for some choice of $C_1, C_2 \in \mathbb{R}$. Consider the function
$\tilde{y}(t) := y(t) - y_p(t)$. Note that
$$\begin{aligned}
\tilde{y}'' + p(t)\tilde{y}' + q(t)\tilde{y} &= (y - y_p)'' + p(t)(y - y_p)' + q(t)(y - y_p) \\
&= y'' + p(t)y' + q(t)y - y_p'' - p(t)y_p' - q(t)y_p \quad \text{because the derivative is linear} \\
&= g(t) - g(t) \quad \text{because both $y$ and $y_p$ are solutions to (A)} \\
&= 0.
\end{aligned}$$

Thus $\tilde{y}(t)$ is a solution to (B). Since $y_1(t), y_2(t)$ form a fundamental set of solutions
to (B), there are constants $C_1, C_2 \in \mathbb{R}$ such that $\tilde{y} = C_1 y_1 + C_2 y_2$. Thus
$y(t) - y_p(t) = C_1 y_1 + C_2 y_2$ and thus $y(t) = C_1 y_1 + C_2 y_2 + y_p$.


In other words, to find the general solution to an inhomogeneous equation

$$y'' + p(t)y' + q(t)y = g(t),$$
you need to do the following:

(1) First, find a fundamental set of solutions $y_1, y_2$ to the homogeneous equation
$y'' + p(t)y' + q(t)y = 0$ (possibly using techniques from Section 4.2 if
$p$ and $q$ are constants).

(2) Second, find one particular solution $y_p$ to the inhomogeneous equation
$y'' + p(t)y' + q(t)y = g(t)$ (possibly using the method of undetermined
coefficients below if $p$ and $q$ are constants, or the method of variation of
parameters from Section 4.4).

(3) Third, write down the general solution:

$$y(t) = C_1 y_1(t) + C_2 y_2(t) + y_p(t)$$

(4) (If necessary) Fourth, if you are solving an IVP, then use the initial conditions
to solve for the precise values of $C_1, C_2$ from the general solution
in the same way you would solve an IVP for a homogeneous equation.


Finally, we remark that Theorem 4.3.1 (and its proof) ultimately belongs to the
subject of linear algebra (when viewed appropriately). Note that the only relevant
feature from differential equations that got used in the proof was that the LHS
is linear as a result of the derivative being linear (i.e., $(f + g)' = f' + g'$). We
will revisit this theme of “general solution to inhomogeneous is general solution of
homogeneous plus particular solution of inhomogeneous” in the next chapter.


Method of undetermined coefficients. We now introduce the method of
undetermined coefficients. This method allows us to find particular solutions of an
inhomogeneous second-order linear differential equation

$$y'' + p(t)y' + q(t)y = g(t)$$
provided:
(1) $p$ and $q$ are constant functions, and
(2) $g(t)$ is a “nice enough” function.
Ultimately, the method of undetermined coefficients involves guessing a so-called
trial solution, and then plugging in that trial solution to determine a specific
particular solution. We illustrate this first with an example:

Example 4.3.2. Find a particular solution to:

$$y'' + 3y' + 2y = 4e^{-3t}.$$
SOLUTION. Here the forcing term is $g(t) = 4e^{-3t}$. We will guess that there is a
particular solution of the form $y_p(t) = ae^{-3t}$, where $a \in \mathbb{R}$ is an undetermined
coefficient (i.e., an unknown coefficient we need to somehow determine). Thus in
this case our “trial solution” is the function $y_p(t) = ae^{-3t}$. To find $a$, we plug the trial
solution $y_p(t)$ into the equation:

$$y_p'' + 3y_p' + 2y_p = 9ae^{-3t} - 9ae^{-3t} + 2ae^{-3t} = 4e^{-3t}.$$
This simplifies to
$$(9a - 9a + 2a)e^{-3t} = 2ae^{-3t} = 4e^{-3t}$$

and so $2a = 4$, i.e., $a = 2$. Thus the function $y_p(t) = 2e^{-3t}$ is a particular solution
to $y'' + 3y' + 2y = 4e^{-3t}$.
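
The computation in this example generalizes: substituting $ae^{rt}$ into $y'' + py' + qy = Ae^{rt}$ gives $a \cdot f(r) = A$, where $f$ is the characteristic polynomial. The sketch below implements that one-line formula; the function name is hypothetical, and the error branch covers the case where the guess cannot work because $e^{rt}$ already solves the homogeneous equation.

```python
from fractions import Fraction


def exponential_trial_coefficient(p, q, A, r):
    """Coefficient a such that a*exp(r*t) solves y'' + p*y' + q*y = A*exp(r*t)."""
    f_r = r * r + p * r + q  # characteristic polynomial evaluated at r
    if f_r == 0:
        # exp(r*t) solves the homogeneous equation, so this guess cannot work.
        raise ValueError("r is a characteristic root; multiply the guess by t")
    return Fraction(A) / f_r
```

With `p=3, q=2, A=4, r=-3` this reproduces the coefficient found in the example.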


How did we know to guess the trial solution $ae^{-3t}$ in the above example? For many
cases, the trial solution can be correctly guessed by using the following heuristics:
(1) The trial solution should include the function $g(t)$ as a special case. In
the above example, $g(t) = 4e^{-3t}$ is also of the form $ae^{-3t}$.
(2) The trial solution should be a family of functions “closed under the
derivative.” In the above example, the derivative of a function of the form
“$ae^{-3t}$” is $-3ae^{-3t}$, which is also of the form “$ae^{-3t}$” (where “$-3a$” plays
the role of “$a$”).
In practice you can just look up the trial solution you are supposed to guess ac- 
cording to: 
Method of Undetermined Coefficients 4.3.3. Suppose $y'' + py' + qy = g(t)$ is
an inhomogeneous differential equation such that:

(a) $p, q \in \mathbb{R}$ are constants, and
(b) $g(t)$ is not a solution to the homogeneous equation $y'' + py' + qy = 0$.
Then the following gives the trial solution you should guess depending on the form
of the forcing function $g(t)$ (where $A, B, a, b, r, \omega \in \mathbb{R}$, $P(t)$ is a polynomial and
$p_0(t), p_1(t)$ are polynomials of the same degree as $P$). If the forcing function $g(t)$
is of the form...

(1) $e^{rt}$, then the trial solution is $y_p(t) = ae^{rt}$.
(2) $A\cos\omega t + B\sin\omega t$, then the trial solution is $y_p(t) = a\cos\omega t + b\sin\omega t$.
(3) $P(t)$, then the trial solution is $y_p(t) = p_0(t)$.

(4) $P(t)\cos\omega t$ or $P(t)\sin\omega t$, then the trial solution is

$$y_p(t) = p_0(t)\cos\omega t + p_1(t)\sin\omega t.$$

(5) $e^{rt}\cos\omega t$ or $e^{rt}\sin\omega t$, then the trial solution is

$$y_p(t) = e^{rt}(a\cos\omega t + b\sin\omega t).$$

(6) $e^{rt}P(t)\cos\omega t$ or $e^{rt}P(t)\sin\omega t$, then the trial solution is
$$y_p(t) = e^{rt}(p_0(t)\cos\omega t + p_1(t)\sin\omega t).$$

If $g(t)$ is a solution to $y'' + py' + qy = 0$, then use the trial solution $ty_p(t)$, and if that
does not work, then use the trial solution $t^2 y_p(t)$.


Here is an example which falls in case (2) in 4.3.3:
Example 4.3.4. Find a particular solution to
$$y'' + 4y = \cos 3t.$$

SOLUTION. Since $g(t) = \cos 3t$, the Method of Undetermined Coefficients 4.3.3
tells us our trial solution should be $y_p(t) = a\cos 3t + b\sin 3t$, where $a, b \in \mathbb{R}$ are
undetermined coefficients we need to determine. First, we need to compute $y_p'$ and
$y_p''$:

$$\begin{aligned}
y_p'(t) &= -3a\sin 3t + 3b\cos 3t \\
y_p''(t) &= -9a\cos 3t - 9b\sin 3t
\end{aligned}$$

Plugging this into the LHS of the differential equation yields:
$$y_p'' + 4y_p = -9a\cos 3t - 9b\sin 3t + 4(a\cos 3t + b\sin 3t) = -5a\cos 3t - 5b\sin 3t.$$
This needs to equal $\cos 3t$, so we get:
$$-5a\cos 3t - 5b\sin 3t = \cos 3t = 1\cos 3t + 0\sin 3t.$$
This yields the system:
$$\begin{aligned}
-5a &= 1 \\
-5b &= 0
\end{aligned}$$
so we find that $a = -1/5$ and $b = 0$. Thus we find that a particular solution is:

$$y_p(t) = -\frac{1}{5}\cos 3t.$$
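
For a general forcing term $A\cos\omega t + B\sin\omega t$, matching the cosine and sine coefficients always produces a $2 \times 2$ linear system, which can be solved once and for all. The following sketch does this with exact rational arithmetic; the function name and the shorthand `m`, `n` are assumptions made for this illustration.

```python
from fractions import Fraction


def trig_trial_coefficients(p, q, omega, A, B):
    """Return (a, b) such that a*cos(omega*t) + b*sin(omega*t) solves
    y'' + p*y' + q*y = A*cos(omega*t) + B*sin(omega*t)."""
    m = Fraction(q) - Fraction(omega) ** 2  # contribution of y'' + q*y
    n = Fraction(p) * Fraction(omega)       # contribution of p*y'
    det = m * m + n * n
    if det == 0:
        # Happens only when p = 0 and q = omega^2 (the forcing term
        # already solves the homogeneous equation).
        raise ValueError("forcing term solves the homogeneous equation")
    # Matching cos and sin terms gives: m*a + n*b = A and -n*a + m*b = B.
    a = (m * A - n * B) / det
    b = (n * A + m * B) / det
    return a, b
```

Calling `trig_trial_coefficients(0, 4, 3, 1, 0)` corresponds to $y'' + 4y = \cos 3t$.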


Here is an example which falls into case (3) of 4.3.3:
Example 4.3.5. Find a particular solution to
$$y'' + 5y' + 8y = 2t - 3.$$

SOLUTION. Since $g(t) = 2t - 3$ is a polynomial of degree 1, the Method of Undetermined
Coefficients 4.3.3 tells us our trial solution should be $y_p(t) = a_1 t + a_0$
(a polynomial of the same degree as $g(t)$), where $a_1, a_0 \in \mathbb{R}$ are undetermined
coefficients which we need to determine. First we compute $y_p'$ and $y_p''$:

$$\begin{aligned}
y_p'(t) &= a_1 \\
y_p''(t) &= 0
\end{aligned}$$

Next we plug $y_p(t)$ into the LHS of the differential equation to get:
$$y_p'' + 5y_p' + 8y_p = 0 + 5a_1 + 8(a_1 t + a_0) = 8a_1 t + (5a_1 + 8a_0)$$
which needs to equal the RHS $2t - 3$, which gives:
$$8a_1 t + (5a_1 + 8a_0) = 2t - 3.$$
This yields the system:
$$\begin{aligned}
8a_1 &= 2 \\
5a_1 + 8a_0 &= -3
\end{aligned}$$

We can solve this using Gaussian Elimination:

$$\left[\begin{array}{cc|c} 8 & 0 & 2 \\ 5 & 8 & -3 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{cc|c} 1 & 0 & 1/4 \\ 0 & 1 & -17/32 \end{array}\right]$$
Thus a particular solution is

$$y_p(t) = \frac{8t - 17}{32}.$$
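
The coefficient matching for a degree 1 forcing term can be written as a short function. This is a sketch assuming integer inputs and $q \neq 0$, with a hypothetical function name; it uses the coefficient 5 of $y'$ that appears in the substitution step of the example.

```python
from fractions import Fraction


def linear_trial_coefficients(p, q, c1, c0):
    """Return (a1, a0) such that a1*t + a0 solves y'' + p*y' + q*y = c1*t + c0.

    Assumes integer inputs and q != 0.
    """
    # Substituting the trial solution gives q*a1*t + (p*a1 + q*a0) = c1*t + c0.
    a1 = Fraction(c1, q)
    a0 = (c0 - p * a1) / q
    return a1, a0
```

The exact fractions returned by `linear_trial_coefficients(5, 8, 2, -3)` can be compared directly with the result of the Gaussian Elimination.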


The following superposition principle shows how to handle forcing terms which are
a linear combination of forcing terms covered in 4.3.3:

Superposition Principle 4.3.6. Suppose $y_f(t)$ is a particular solution to
$$y'' + p(t)y' + q(t)y = f(t)$$
and $y_g(t)$ is a particular solution to
$$y'' + p(t)y' + q(t)y = g(t).$$
Then for $\alpha, \beta \in \mathbb{R}$, the function $y(t) := \alpha y_f(t) + \beta y_g(t)$ is a solution to
$$y'' + p(t)y' + q(t)y = \alpha f(t) + \beta g(t).$$
Here is an example of the Superposition Principle in use: 
Example 4.3.7. Find a particular solution to
$$y'' + 2y' + 2y = 2 + \cos 2t.$$

SOLUTION. We have two forcing terms here, $f(t) := 2$ and $g(t) := \cos 2t$. We need
to handle each one separately.

First we will find a particular solution $y_f(t)$ to

$$y'' + 2y' + 2y = 2.$$
Since $f(t) = 2$ is a degree 0 polynomial, the trial solution is $y_f(t) = a_0$, also a
degree 0 polynomial. Plugging this in to the LHS and equating this to the RHS
yields:
$$y_f'' + 2y_f' + 2y_f = 2a_0 = 2.$$

Thus we find that $a_0 = 1$ and thus $y_f(t) = 1$ is a particular solution.

Next we will find a particular solution $y_g(t)$ to

$$y'' + 2y' + 2y = \cos 2t.$$

Since $g(t) = \cos 2t$, the trial solution is $y_g(t) = a\cos 2t + b\sin 2t$. First note that

$$\begin{aligned}
y_g'(t) &= -2a\sin 2t + 2b\cos 2t \\
y_g''(t) &= -4a\cos 2t - 4b\sin 2t
\end{aligned}$$

Plugging this into the LHS and equating it to the RHS yields:
$$\begin{aligned}
y_g'' + 2y_g' + 2y_g &= (-4a\cos 2t - 4b\sin 2t) + 2(-2a\sin 2t + 2b\cos 2t) + 2(a\cos 2t + b\sin 2t) \\
&= (-2a + 4b)\cos 2t + (-2b - 4a)\sin 2t = \cos 2t
\end{aligned}$$
This yields the system:
$$\begin{aligned}
-2a + 4b &= 1 \\
-4a - 2b &= 0
\end{aligned}$$
We can find $a, b$ by Gaussian Elimination:
$$\left[\begin{array}{cc|c} -2 & 4 & 1 \\ -4 & -2 & 0 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{cc|c} 1 & 0 & -1/10 \\ 0 & 1 & 1/5 \end{array}\right]$$
Thus we find that $a = -1/10$, $b = 1/5$ (indeed, $-2(-1/10) + 4(1/5) = 1$ and $-4(-1/10) - 2(1/5) = 0$), and so

$$y_g(t) = \frac{-\cos 2t + 2\sin 2t}{10}.$$

We conclude that a particular solution to the original differential equation is:

$$y_p(t) = y_f(t) + y_g(t) = 1 + \frac{-\cos 2t + 2\sin 2t}{10}.$$
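
The Superposition Principle can be checked numerically for this example. The sketch below solves the system $-2a + 4b = 1$, $-4a - 2b = 0$ exactly by Cramer's rule and then evaluates the residual of the combined solution $1 + a\cos 2t + b\sin 2t$ using hand-computed derivatives; all names are hypothetical and chosen for this illustration.

```python
import math
from fractions import Fraction

# Solve -2a + 4b = 1, -4a - 2b = 0 exactly by Cramer's rule.
det = Fraction((-2) * (-2) - 4 * (-4))
a = Fraction(1 * (-2) - 4 * 0) / det
b = Fraction((-2) * 0 - (-4) * 1) / det


def y_p(t):
    """Candidate particular solution y_f + y_g with y_f = 1."""
    return 1 + float(a) * math.cos(2 * t) + float(b) * math.sin(2 * t)


def residual(t):
    """y_p'' + 2*y_p' + 2*y_p - (2 + cos 2t), using exact derivatives."""
    fa, fb = float(a), float(b)
    dy = -2 * fa * math.sin(2 * t) + 2 * fb * math.cos(2 * t)
    d2y = -4 * fa * math.cos(2 * t) - 4 * fb * math.sin(2 * t)
    return d2y + 2 * dy + 2 * y_p(t) - (2 + math.cos(2 * t))
```

A residual that is zero (up to rounding) at several values of $t$ is good evidence that the coefficients and the superposition step are correct.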


4.4. Variation of parameters 


In this section we introduce a method of finding a particular solution to an
inhomogeneous equation

$$y'' + p(t)y' + q(t)y = g(t)$$
provided we already know a fundamental set of solutions $y_1(t), y_2(t)$ to the associated
homogeneous equation:

$$y'' + p(t)y' + q(t)y = 0.$$


The method essentially will rely on the following fact about 2 x 2 systems of equa- 
tions (which will be justified in the next chapter): 


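Fact 4.4.1. Suppose $a, b, c, d, e, f \in \mathbb{R}$ are numbers such that

$$W := \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc \neq 0.$$

Then the system

$$\begin{aligned}
ax + by &= e \\
cx + dy &= f
\end{aligned}$$
has the unique solution:
$$x = \frac{de - bf}{W}, \qquad y = \frac{-ce + af}{W}.$$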
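
This formula translates directly into code. Here is a minimal sketch with exact rational arithmetic; the function name `solve_2x2` is hypothetical.

```python
from fractions import Fraction


def solve_2x2(a, b, c, d, e, f):
    """Solve a*x + b*y = e, c*x + d*y = f when W = a*d - b*c is nonzero."""
    W = Fraction(a) * d - Fraction(b) * c
    if W == 0:
        raise ValueError("W = ad - bc is zero, so the formula does not apply")
    x = (Fraction(d) * e - Fraction(b) * f) / W
    y = (-Fraction(c) * e + Fraction(a) * f) / W
    return x, y
```

For example, `solve_2x2(1, 2, 3, 4, 0, 1)` solves $x + 2y = 0$, $3x + 4y = 1$.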


Variation of Parameters 4.4.2. Suppose $y_1(t), y_2(t)$ is a fundamental set of
solutions to:

$$y'' + p(t)y' + q(t)y = 0,$$
(in particular, $W(t) := y_1 y_2' - y_2 y_1' \neq 0$ for all $t$). Then the inhomogeneous equation:
$$y'' + p(t)y' + q(t)y = g(t)$$
has the following as a particular solution:
$$y_p(t) = y_1(t) \int \frac{-y_2(t)g(t)}{W(t)}\,dt + y_2(t) \int \frac{y_1(t)g(t)}{W(t)}\,dt.$$
PROOF. We know that
$$y(t) = C_1 y_1(t) + C_2 y_2(t)$$
is the general solution to the homogeneous equation. The idea is to replace the
constants $C_1, C_2$ with unknown functions $v_1, v_2$ and look for a particular solution to
the inhomogeneous equation of the form
$$y_p = v_1 y_1 + v_2 y_2.$$

First we compute the first derivative of $y_p$:

$$y_p' = v_1 y_1' + v_1' y_1 + v_2 y_2' + v_2' y_2 = (v_1 y_1' + v_2 y_2') + (v_1' y_1 + v_2' y_2)$$
Ideally, we do not want to deal with any second-order derivatives of $v_1$ and $v_2$,
since otherwise we would not be making our lives any easier. Furthermore, in some
sense requiring $y_p$ to be a solution to the inhomogeneous equation places only one
condition on the two unknown functions $v_1, v_2$, thus we have some “freedom” to
impose a second condition in case it helps. Thus, we will additionally assume:

(A) $v_1' y_1 + v_2' y_2 = 0$.
Now we compute the second derivative of $y_p$:
$$y_p'' = \big[(v_1 y_1' + v_2 y_2') + \underbrace{(v_1' y_1 + v_2' y_2)}_{=0}\big]' = v_1 y_1'' + v_1' y_1' + v_2 y_2'' + v_2' y_2'$$
Next we plug $y_p$ into the LHS of the differential equation and simplify. Note that:
$$\begin{aligned}
y_p'' + p y_p' + q y_p &= v_1 y_1'' + v_1' y_1' + v_2 y_2'' + v_2' y_2' + p(v_1 y_1' + v_2 y_2') + q(v_1 y_1 + v_2 y_2) \\
&= v_1(y_1'' + p y_1' + q y_1) + v_2(y_2'' + p y_2' + q y_2) + v_1' y_1' + v_2' y_2' \\
&= v_1' y_1' + v_2' y_2',
\end{aligned}$$

because $y_1, y_2$ are solutions to the homogeneous equation. Setting LHS equal to
RHS yields:

(B) $v_1' y_1' + v_2' y_2' = g(t)$.
Now, we combine (A) and (B) into a single system in the unknown “variables”
$v_1', v_2'$:

$$\begin{aligned}
y_1 v_1' + y_2 v_2' &= 0 \\
y_1' v_1' + y_2' v_2' &= g(t)
\end{aligned}$$

Since $y_1, y_2$ is a fundamental set of solutions, we see that

$$W(t) = \det \begin{bmatrix} y_1 & y_2 \\ y_1' & y_2' \end{bmatrix} \neq 0$$

for every $t$. Thus by Fact 4.4.1, we get

$$v_1' = \frac{-y_2(t)g(t)}{W(t)} \quad \text{and} \quad v_2' = \frac{y_1(t)g(t)}{W(t)}.$$

Finally, $v_1, v_2$ are obtained by integrating $v_1', v_2'$ with respect to $t$.


Example 4.4.3. Find a particular solution to the inhomogeneous equation
$$y'' + y = \tan t$$
on the interval $(-\pi/2, \pi/2)$.

SOLUTION. First we find a fundamental set of solutions to $y'' + y = 0$. Note that the
characteristic polynomial is $f(\lambda) = \lambda^2 + 1 = (\lambda - i)(\lambda + i)$. Thus $\lambda_1, \lambda_2 = \pm i$, and
so a fundamental set of solutions is $y_1(t) = \cos t$, $y_2(t) = \sin t$. Next we compute
the Wronskian:

$$W(t) = \det \begin{bmatrix} y_1 & y_2 \\ y_1' & y_2' \end{bmatrix} = \cos t(\cos t) - \sin t(-\sin t) = \cos^2 t + \sin^2 t = 1.$$
Next, we get $v_1$:

$$\begin{aligned}
v_1(t) &= \int \frac{-y_2(t)g(t)}{W(t)}\,dt \\
&= \int -\sin t \tan t\,dt \\
&= -\int \frac{\sin^2 t}{\cos t}\,dt \\
&= \int \frac{\cos^2 t - 1}{\cos t}\,dt \\
&= \int (\cos t - \sec t)\,dt \\
&= \sin t - \ln|\sec t + \tan t| \\
&= \sin t - \ln(\sec t + \tan t)
\end{aligned}$$

since $\sec t + \tan t > 0$ on the interval $(-\pi/2, \pi/2)$. Next we get $v_2$:

$$\begin{aligned}
v_2(t) &= \int \frac{y_1(t)g(t)}{W(t)}\,dt \\
&= \int \cos t \tan t\,dt \\
&= \int \sin t\,dt \\
&= -\cos t.
\end{aligned}$$

We conclude that a particular solution is:

$$\begin{aligned}
y_p(t) = y_1 v_1 + y_2 v_2 &= \cos t(\sin t - \ln(\sec t + \tan t)) + \sin t(-\cos t) \\
&= -\cos t \ln(\sec t + \tan t).
\end{aligned}$$
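
Variation of Parameters can also be carried out numerically, which gives an independent check of the closed-form answer. The sketch below is an illustration only: it approximates the two integrals with the composite Simpson rule (a standard quadrature method that is not part of these notes), integrating from a hypothetical base point `t0 = 0`. Because definite integrals fix the constants of integration differently, the numerical result differs from $-\cos t \ln(\sec t + \tan t)$ by the homogeneous solution $\sin t$.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def variation_of_parameters(y1, y2, wronskian, g, t, t0=0.0):
    """Particular solution y1*v1 + y2*v2 with v1, v2 integrated from t0 to t."""
    v1 = simpson(lambda s: -y2(s) * g(s) / wronskian(s), t0, t)
    v2 = simpson(lambda s: y1(s) * g(s) / wronskian(s), t0, t)
    return y1(t) * v1 + y2(t) * v2


def tan_example(t):
    """Numerical particular solution of y'' + y = tan t (W(t) = 1)."""
    return variation_of_parameters(math.cos, math.sin, lambda s: 1.0, math.tan, t)


def closed_form(t):
    """The closed-form particular solution -cos t * ln(sec t + tan t)."""
    return -math.cos(t) * math.log(1 / math.cos(t) + math.tan(t))
```

Comparing `tan_example(t) - math.sin(t)` with `closed_form(t)` for a few values of $t$ in $(-\pi/2, \pi/2)$ should show agreement to many digits.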


Example 4.4.4. Find a particular solution to:
$$t^2 y'' + t y' - y = t \ln t,$$
where $y_1(t) = t$ and $y_2(t) = 1/t$ is a fundamental set of solutions to
$$t^2 y'' + t y' - y = 0.$$

SOLUTION. In order to use Variation of Parameters, the coefficient of $y''$ in the
inhomogeneous equation needs to equal 1. Thus we will divide the equation through by
$t^2$ to obtain:

$$y'' + \frac{y'}{t} - \frac{y}{t^2} = \frac{\ln t}{t}.$$

Thus $g(t) = (\ln t)/t$. Furthermore, since the differential equation only makes sense
on the interval $(0, +\infty)$, this is where we will work. Next, we need to compute the
Wronskian of the two fundamental solutions:

$$W(t) = \det \begin{bmatrix} y_1 & y_2 \\ y_1' & y_2' \end{bmatrix} = t(-1/t^2) - (1/t) \cdot 1 = -\frac{1}{t} - \frac{1}{t} = -\frac{2}{t}.$$
Next we compute $v_1(t)$:
$$\begin{aligned}
v_1(t) &= \int \frac{-y_2(t)g(t)}{W(t)}\,dt \\
&= \int \frac{-(1/t)(\ln t)/t}{-2/t}\,dt \\
&= \frac{1}{2} \int \frac{\ln t}{t}\,dt \\
&= \frac{(\ln t)^2}{4}.
\end{aligned}$$

We also compute $v_2(t)$:

$$\begin{aligned}
v_2(t) &= \int \frac{y_1(t)g(t)}{W(t)}\,dt \\
&= \int \frac{t(\ln t)/t}{-2/t}\,dt \\
&= -\frac{1}{2} \int t \ln t\,dt \\
&= -\frac{t^2(2\ln t - 1)}{8}.
\end{aligned}$$

We conclude that a particular solution is:

$$y_p(t) = y_1 v_1 + y_2 v_2 = \frac{t(\ln t)^2}{4} - \frac{t(2\ln t - 1)}{8}.$$

CHAPTER 5


Linear algebra II 


We have already seen in Chapter 1 that augmented matrices are very
useful for systematically solving systems of equations. For the next step in our
linear algebra journey, we will treat matrices as a fundamental object of interest in
their own right and work with them almost exclusively.


5.1. Matrices and vectors 


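Definition 5.1.1. Suppose $m, n \geq 1$. A matrix (of size $m \times n$) is a rectangular
array of real numbers with $m$ rows and $n$ columns:

$$A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}$$

Sometimes we abbreviate a matrix by writing:
$$A = (a_{ij})_{1 \leq i \leq m,\, 1 \leq j \leq n} \quad \text{or just} \quad A = (a_{ij})$$

if the size of the matrix $A$ is clear from context. Given $i \in \{1, \ldots, m\}$ and $j \in
\{1, \ldots, n\}$, the number $a_{ij}$ is called the $(i, j)$-entry (or component) of $A$. We
denote the set of all $m \times n$ matrices (with real numbers as entries) by $\mathrm{Mat}_{m \times n}(\mathbb{R})$.

A matrix in $\mathrm{Mat}_{m \times 1}(\mathbb{R})$ with only one column:

$$\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}$$

is often called a column vector. We will often denote $\mathrm{Mat}_{m \times 1}(\mathbb{R})$ by $\mathbb{R}^m$, and write
column vectors with bold letters $\mathbf{a}, \mathbf{b}, \mathbf{c}, \mathbf{x}, \mathbf{y}, \mathbf{z}$, etc.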


For each $m, n \geq 1$, we define the zero matrix in $\mathrm{Mat}_{m \times n}(\mathbb{R})$ to be the $m \times n$
matrix where every entry is $0$:

$$0_{m \times n} = \begin{bmatrix}
0 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0
\end{bmatrix}$$

Sometimes we will denote $0_{m \times n}$ as just $0$ when it is clear from context that we are
talking about the $m \times n$ zero matrix (and not, for instance, the number $0 \in \mathbb{R}$).
A matrix is, in a certain sense, a vast generalization of a number. Just as we
can add, subtract, multiply, and divide numbers, we can sometimes do versions of
these things with matrices. Here are the most fundamental operations defined for
matrices:


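Definition 5.1.2. Fix $m, n \geq 1$. Given two matrices $A, B \in \mathrm{Mat}_{m \times n}(\mathbb{R})$, we define
their matrix sum $A + B \in \mathrm{Mat}_{m \times n}(\mathbb{R})$ to be the $m \times n$ matrix whose $(i, j)$-entry
is $a_{ij} + b_{ij}$, i.e.,

$$A + B = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix} + \begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
b_{m1} & b_{m2} & \cdots & b_{mn}
\end{bmatrix} = \begin{bmatrix}
a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\
a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn}
\end{bmatrix}$$

Furthermore, given $\alpha \in \mathbb{R}$, we define the scalar multiple of $A$ by $\alpha$ to be the
matrix $\alpha A \in \mathrm{Mat}_{m \times n}(\mathbb{R})$ whose $(i, j)$-entry is $\alpha a_{ij}$, i.e.,

$$\alpha A = \alpha \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix} = \begin{bmatrix}
\alpha a_{11} & \alpha a_{12} & \cdots & \alpha a_{1n} \\
\alpha a_{21} & \alpha a_{22} & \cdots & \alpha a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\alpha a_{m1} & \alpha a_{m2} & \cdots & \alpha a_{mn}
\end{bmatrix}$$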
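Example 5.1.3. (1) Here is an example of how matrix addition works (for
matrices in $\mathrm{Mat}_{3 \times 2}(\mathbb{R})$):
$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix} + \begin{bmatrix} 1 & 1 \\ 2 & 3 \\ 5 & 8 \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 5 & 7 \\ 10 & 14 \end{bmatrix}$$

(2) Here is an example of how scalar multiplication works (for column vectors
in $\mathbb{R}^4$, which is the same thing as matrices in $\mathrm{Mat}_{4 \times 1}(\mathbb{R})$):

$$3 \begin{bmatrix} 0 \\ 1 \\ 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \\ 0 \\ -3 \end{bmatrix}$$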

Fact 5.1.4. Suppose $m, n \geq 1$, $A, B, C \in \mathrm{Mat}_{m \times n}(\mathbb{R})$, and $\alpha, \beta \in \mathbb{R}$. Then the
following facts¹ about matrix addition and scalar multiplication hold:
(1) $(A + B) + C = A + (B + C)$ (associativity of addition)
(2) $0_{m \times n} + A = A + 0_{m \times n} = A$ (additive identity)
(3) $A + (-1)A = 0_{m \times n}$ (additive inverse)
(4) $A + B = B + A$ (commutativity of addition)
(5) $\alpha(A + B) = \alpha A + \alpha B$ (right distributivity)
(6) $(\alpha + \beta)A = \alpha A + \beta A$ (left distributivity)
(7) $(\alpha\beta)A = \alpha(\beta A)$ (associativity of scalar multiplication)
(8) $1 \cdot A = A$ (here $1 \in \mathbb{R}$ is a scalar)

¹These facts say that the set $\mathrm{Mat}_{m \times n}(\mathbb{R})$ equipped with matrix addition and scalar
multiplication is a vector space over the real numbers $\mathbb{R}$.


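Definition 5.1.5. Suppose $n \geq 1$. A linear combination of column vectors
$\mathbf{v}_1, \ldots, \mathbf{v}_m \in \mathbb{R}^n$ is an expression of the form:

$$\alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 + \cdots + \alpha_m \mathbf{v}_m,$$
where $\alpha_1, \alpha_2, \ldots, \alpha_m \in \mathbb{R}$ are scalars.
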
5.2. Matrix equations 


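Matrices and vectors give us a superior way of writing and talking about systems of
equations:

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}$$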

In order to make sense of this, the first step is to define the product of a matrix 
with a column vector: 


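Definition 5.2.1. Suppose $A \in \mathrm{Mat}_{m \times n}(\mathbb{R})$ and $\mathbf{x} \in \mathbb{R}^n$. We define the product
to be the column vector $A\mathbf{x} \in \mathbb{R}^m$ whose $(i, 1)$-entry is

$$(A\mathbf{x})_{i1} = \sum_{k=1}^{n} a_{ik} x_k$$

where
$$A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix} \quad \text{and} \quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$

Another way to say this: we can write the matrix $A$ as a collection of $n$ column
vectors in $\mathbb{R}^m$:
$$A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$$
Then the product $A\mathbf{x}$ is defined to be the linear combination:
$$A\mathbf{x} := x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_n \mathbf{a}_n.$$

Written yet another way, this is:

$$A\mathbf{x} = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\
\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n
\end{bmatrix}$$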


Here is an example of a product of a matrix with a column vector: 


Example 5.2.2. Consider the matrix and column vector: 


| a a is 
a=[; _9 4 and x = : 


1
Then the product Ax is:


— ; . 4] 2 7 ES Reed = i 


Warning 5.2.3. In order for the product of a matrix A and a column vector x to 
be defined and make sense, the number of columns of A needs to equal the number 
of rows of x. Otherwise, the product Ax is not defined and thus does not make 
sense. For example, you can multiply a 2 x 2 matrix with a 2 x 1 column vector, 
but you cannot multiply a 2 x 3 matrix with a 2 x 1 column vector. 
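
The definition of the product and the size requirement from the warning can both be captured in a short function. This is a minimal sketch with a hypothetical name `mat_vec`; the matrix is a list of rows and the column vector is a flat list of its entries.

```python
def mat_vec(A, x):
    """Product of an m x n matrix A (list of rows) with a vector x of length n."""
    n_cols = len(A[0])
    if any(len(row) != n_cols for row in A):
        raise ValueError("rows of A must all have the same length")
    if len(x) != n_cols:
        # The number of columns of A must equal the number of rows of x.
        raise ValueError(f"A has {n_cols} columns but x has {len(x)} entries")
    # Entry i of the product is the sum over k of a_ik * x_k.
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]
```

Attempting to multiply a $2 \times 3$ matrix with a vector of length 2 raises an error, mirroring the warning above.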


Here are some basic facts about matrix multiplication which we will use: 


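Fact 5.2.4. Suppose $A \in \mathrm{Mat}_{m \times n}(\mathbb{R})$, $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, and $\alpha \in \mathbb{R}$. Then:
(1) $A(\alpha\mathbf{x}) = \alpha A\mathbf{x}$
(2) $A(\mathbf{x} + \mathbf{y}) = A\mathbf{x} + A\mathbf{y}$.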


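Systems of equations. Now, we can interpret a system of equations:

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned} \tag{5.1}$$

as a matrix equation:

(1) First, we define the coefficient matrix to be the $m \times n$ matrix $A \in
\mathrm{Mat}_{m \times n}(\mathbb{R})$ defined by:

$$A := \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}$$
(2) Second, we combine our unknown variables $x_1, \ldots, x_n$ into a single vector
of unknowns $\mathbf{x}$ (of size $n \times 1$):

$$\mathbf{x} := \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$

(3) Third, we combine our right-hand side parameters $b_1, \ldots, b_m$ into a single
column vector $\mathbf{b} \in \mathbb{R}^m$:

$$\mathbf{b} := \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix}$$

(4) Finally, we can translate the system (5.1) into the matrix equation:
$$A\mathbf{x} = \mathbf{b} \tag{5.2}$$
If we write $A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$ in terms of its column vectors, then
we can also express the equation (5.2) as:
$$x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_n \mathbf{a}_n = \mathbf{b}.$$

We could also express (5.2) by writing everything out fully:

$$\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$

There are advantages and disadvantages to each choice of notation, although
ultimately these are just equivalent ways of rewriting (5.1) in terms
of matrices and vectors.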


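Remark 5.2.5. When it comes to solving matrix equations $A\mathbf{x} = \mathbf{b}$, everything
from Chapter 1 applies. For instance, suppose we wish to solve the matrix equation:

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

To solve this, we set up the corresponding augmented matrix and take it to RREF:

$$\begin{aligned}
\left[\begin{array}{cc|c} 1 & 2 & 0 \\ 3 & 4 & 1 \end{array}\right]
&\xrightarrow{R_2 - 3R_1 \to R_2} \left[\begin{array}{cc|c} 1 & 2 & 0 \\ 0 & -2 & 1 \end{array}\right] \\
&\xrightarrow{-\frac{1}{2}R_2 \to R_2} \left[\begin{array}{cc|c} 1 & 2 & 0 \\ 0 & 1 & -1/2 \end{array}\right] \\
&\xrightarrow{R_1 - 2R_2 \to R_1} \left[\begin{array}{cc|c} 1 & 0 & 1 \\ 0 & 1 & -1/2 \end{array}\right]
\end{aligned}$$

and we see that $(x_1, x_2) = (1, -1/2)$ is the unique solution to the system of equations.
In other words, the column vector

$$\mathbf{x} = \begin{bmatrix} 1 \\ -1/2 \end{bmatrix}$$

is the unique solution to the matrix equation

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
Indeed, note that the product

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ -1/2 \end{bmatrix} = \begin{bmatrix} 1(1) + 2(-1/2) \\ 3(1) + 4(-1/2) \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
gives the correct right-hand side of the equation.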
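
The row reduction in this remark can be automated. The following is a small sketch of Gauss-Jordan elimination with exact fractions; the function name `rref` and the list-of-rows representation are assumptions of this sketch, and the row operations it picks may occur in a different order than in a hand computation, although the resulting RREF is the same.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]  # scale pivot to 1
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                # Subtract a multiple of the pivot row to clear this entry.
                m[r] = [x - factor * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m
```

Applying `rref` to the augmented matrix `[[1, 2, 0], [3, 4, 1]]` gives the same final matrix as the hand computation, and the last column can then be read off as the solution.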


5.3. Nullspace, linear independence, and dimension 
In this section we will dive deeper into important features of a matrix equation: 
Ax = b 


Ultimately, we will be establishing important definitions and basic properties
involving these definitions, in order to better understand how the solutions to a matrix
equation look and behave. Since we already learned how to completely solve matrix
equations (disguised as systems of equations) in Chapter 1, there will not be any
new computational methods in this section. However, we will repurpose the method
of Gaussian Elimination to answer many more types of questions related to matrix
equations and their solutions.


There will be a strong analogy between the nature of solutions to a matrix equation 
and the nature of solutions to a linear differential equation (since both are secretly 
applications of abstract linear algebra). The first similarity already shows up in the 
following definition: 

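Definition 5.3.1. Suppose $A \in \mathrm{Mat}_{m \times n}(\mathbb{R})$ and consider the matrix equation
$$A\mathbf{x} = \mathbf{b} \tag{5.3}$$

where $\mathbf{b} \in \mathbb{R}^m$. We say that the equation (5.3) is homogeneous if $\mathbf{b} = \mathbf{0}_{m \times 1}$ is
the zero vector in $\mathbb{R}^m$. Otherwise, if $\mathbf{b} \neq \mathbf{0}$, then we say that the equation (5.3) is
inhomogeneous.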


We will first be interested in studying homogeneous matrix equations. In this 
context, the following is a very important definition: 


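Definition 5.3.2. Suppose $A \in \mathrm{Mat}_{m \times n}(\mathbb{R})$. We define the nullspace of $A$ to be
the following subset of $\mathbb{R}^n$:

$$\mathrm{null}(A) := \{\mathbf{x} \in \mathbb{R}^n : A\mathbf{x} = \mathbf{0}\} \subseteq \mathbb{R}^n$$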


In other words, the nullspace null(A) of the matrix A is the set of all solutions to 
the homogeneous equation Ax = 0. 


From Chapter 1 we already know how to compute the nullspace of a matrix:


Example 5.3.3. Find the nullspace of the following matrix:

$$A = \begin{bmatrix} 1 & 1 & 0 & 4 \\ 0 & 0 & 1 & 2 \end{bmatrix}$$

SOLUTION. We need to find the set of all vectors $\mathbf{x} \in \mathbb{R}^4$ such that $A\mathbf{x} = \mathbf{0}$. This
means the same thing as finding all solutions to the system of equations:

$$\begin{aligned}
x_1 + x_2 + 4x_4 &= 0 \\
x_3 + 2x_4 &= 0.
\end{aligned}$$
To do this, we set up the system as an augmented matrix and take it to RREF:
$$\left[\begin{array}{cccc|c} 1 & 1 & 0 & 4 & 0 \\ 0 & 0 & 1 & 2 & 0 \end{array}\right]$$

Here we see that the augmented matrix is already in RREF, so we can read off the
solutions. We see that $x_2, x_4$ are free variables, so the general solution is:

$$\begin{aligned}
x_1 &= -x_2 - 4x_4 \\
x_2 &= x_2 \\
x_3 &= -2x_4 \\
x_4 &= x_4
\end{aligned}$$

We can write this in parametric form as a set of linear combinations of vectors in $\mathbb{R}^4$:

$$\mathrm{null}(A) = \left\{ x_2 \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \end{bmatrix} : x_2, x_4 \in \mathbb{R} \right\}$$
The nullspace of a matrix is also closed under linear combinations (analogous to
Proposition 4.1.5):


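Proposition 5.3.4. Suppose $A \in \mathrm{Mat}_{m\times n}(\mathbb{R})$. Let $\mathbf{x}, \mathbf{y} \in \mathrm{null}(A)$, $\alpha \in \mathbb{R}$ be arbitrary. Then:

(1) $\mathbf{0} \in \mathrm{null}(A)$, where $\mathbf{0} = \mathbf{0}_{n\times 1}$ is the zero vector in $\mathbb{R}^n$,

(2) $\mathbf{x} + \mathbf{y} \in \mathrm{null}(A)$, and

(3) $\alpha\mathbf{x} \in \mathrm{null}(A)$.

[In linear algebra terms, this says that $\mathrm{null}(A)$ is a subspace of $\mathbb{R}^n$.]

Proof. (1) Let $\mathbf{0} = \mathbf{0}_{n\times 1}$ be the zero vector in $\mathbb{R}^n$. Then by Definition 5.2.1 it follows that $A\mathbf{0}_{n\times 1} = \mathbf{0}_{m\times 1}$. Thus $\mathbf{0}_{n\times 1} \in \mathrm{null}(A)$.

(2) Note that

\[ \begin{aligned} A(\mathbf{x} + \mathbf{y}) &= A\mathbf{x} + A\mathbf{y} \\ &= \mathbf{0}_{m\times 1} + \mathbf{0}_{m\times 1} \quad \text{since } \mathbf{x}, \mathbf{y} \in \mathrm{null}(A) \\ &= \mathbf{0}_{m\times 1} \end{aligned} \]

and thus $\mathbf{x} + \mathbf{y} \in \mathrm{null}(A)$.

(3) Note that

\[ \begin{aligned} A(\alpha\mathbf{x}) &= \alpha A\mathbf{x} \\ &= \alpha\mathbf{0}_{m\times 1} \quad \text{since } \mathbf{x} \in \mathrm{null}(A) \\ &= \mathbf{0}_{m\times 1}. \end{aligned} \]

Thus $\alpha\mathbf{x} \in \mathrm{null}(A)$.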


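Next, we want to say a few words about how to efficiently describe a nullspace. We now define a notation which allows us to describe a large set of vectors in $\mathbb{R}^n$:


Definition 5.3.5. Suppose $\mathbf{x}_1, \ldots, \mathbf{x}_k \in \mathbb{R}^n$. The span of $\mathbf{x}_1, \ldots, \mathbf{x}_k$ is the set of all linear combinations of $\mathbf{x}_1, \ldots, \mathbf{x}_k$:

\[ \mathrm{span}(\mathbf{x}_1, \ldots, \mathbf{x}_k) := \{ a_1\mathbf{x}_1 + \cdots + a_k\mathbf{x}_k : a_1, \ldots, a_k \in \mathbb{R} \} \]

In other words, the span of $\mathbf{x}_1, \ldots, \mathbf{x}_k$ is the set of all vectors which can be “created” from $\mathbf{x}_1, \ldots, \mathbf{x}_k$.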


Example 5.3.6. Here are some common usages of span: 
(1) We can describe $\mathbb{R}^2$ as a span, in multiple different ways:


at = sows (9) -[2]) = = ([0)- [La] -[4]) = = (L)-Ea)) 


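(2) We can describe $\mathbb{R}^3$ as a span:

\[ \mathbb{R}^3 = \mathrm{span}\left( \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right) \]

(There are infinitely many other ways to describe $\mathbb{R}^3$ as a span.)

(3) Returning to Example 5.3.3 above, we found that

\[ \mathrm{null}(A) = \left\{ x_2 \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \end{bmatrix} : x_2, x_4 \in \mathbb{R} \right\} \]

Another way of writing this in terms of span:

\[ \mathrm{null}(A) = \mathrm{span}\left( \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \end{bmatrix} \right) \]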


In some sense, it is better to express: 


v= me({ ff) = © -=e(() 
= om() AED 


The reason is that in this last description, two of the four vectors are redundant. For instance, the third and fourth can already be written as linear combinations of the first and second vectors, and vice versa. The next concept we will introduce is “non-redundancy”, better known as linear independence (compare to the corresponding earlier definition; also recall our earlier statement: the notion of linear independence is one of the most important definitions in undergraduate mathematics):


instead of 


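Definition 5.3.7. Suppose $\mathbf{x}_1, \ldots, \mathbf{x}_k \in \mathbb{R}^n$. We say that $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are linearly independent if for every $c_1, \ldots, c_k \in \mathbb{R}$, if $c_1\mathbf{x}_1 + \cdots + c_k\mathbf{x}_k = \mathbf{0}$, then $c_1 = c_2 = \cdots = c_k = 0$. In other words, $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are linearly independent iff the homogeneous matrix equation

\[ A\mathbf{c} = \mathbf{0} \quad \text{where} \quad A = \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_k \end{bmatrix} \]

has exactly one solution, $\mathbf{c} = \mathbf{0}_{k\times 1}$.

Otherwise, we say that $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are linearly dependent. In other words, $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are linearly dependent if there exist $c_1, \ldots, c_k \in \mathbb{R}$ such that $c_1\mathbf{x}_1 + \cdots + c_k\mathbf{x}_k = \mathbf{0}$ and $c_i \neq 0$ for at least one $i \in \{1, \ldots, k\}$. In this case, the linear combination $c_1\mathbf{x}_1 + \cdots + c_k\mathbf{x}_k = \mathbf{0}$ is called a nontrivial dependence relation.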


We can use Gaussian Elimination to check if a collection of vectors is linearly 
(in)dependent: 


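Example 5.3.8. Here is an example of linear independence and linear dependence.

(1) The vectors

\[ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 2 \\ 4 \\ 8 \end{bmatrix}, \begin{bmatrix} 3 \\ 9 \\ 27 \end{bmatrix} \]

are linearly independent. Why? We need to show that the equation $c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + c_3\mathbf{x}_3 = \mathbf{0}_{3\times 1}$ has only one solution $(c_1, c_2, c_3) = (0, 0, 0)$. This is equivalent to showing that the system of equations:

\[ \begin{aligned} c_1 + 2c_2 + 3c_3 &= 0 \\ 2c_1 + 4c_2 + 9c_3 &= 0 \\ 3c_1 + 8c_2 + 27c_3 &= 0 \end{aligned} \]

has a unique solution. To see this, we set up an augmented matrix and take it to RREF:

\[ \left[\begin{array}{ccc|c} 1 & 2 & 3 & 0 \\ 2 & 4 & 9 & 0 \\ 3 & 8 & 27 & 0 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right] \]

Since every variable is a pivot variable and there are no free variables, we see that there is a unique solution, which must be $(c_1, c_2, c_3) = (0, 0, 0)$.

(2) The vectors

\[ \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 3 \\ -1 \\ 4 \end{bmatrix}, \begin{bmatrix} -3 \\ 3 \\ 0 \end{bmatrix} \]

are linearly dependent. Why? We need to find a nontrivial dependence relation between these three vectors, which is equivalent to finding a nontrivial solution to the following system of equations:

\[ \begin{aligned} 3c_2 - 3c_3 &= 0 \\ c_1 - c_2 + 3c_3 &= 0 \\ 2c_1 + 4c_2 &= 0 \end{aligned} \]

To do this, we set up an augmented matrix and take it to RREF:

\[ \left[\begin{array}{ccc|c} 0 & 3 & -3 & 0 \\ 1 & -1 & 3 & 0 \\ 2 & 4 & 0 & 0 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{ccc|c} 1 & 0 & 2 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

We see that $c_3$ is a free variable and so the general solution is:

\[ \begin{aligned} c_1 &= -2c_3 \\ c_2 &= c_3 \\ c_3 &= c_3 \end{aligned} \]

which we can write in parametric form:

\[ \left\{ c_3 \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} : c_3 \in \mathbb{R} \right\} \]

To get a nontrivial solution, we can choose, for instance, $c_3 := 1$ to get the solution $(c_1, c_2, c_3) = (-2, 1, 1)$. This gives us a nontrivial dependence relation:

\[ (-2)\begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 3 \\ -1 \\ 4 \end{bmatrix} + \begin{bmatrix} -3 \\ 3 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

We conclude these three vectors are linearly dependent.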
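The Gaussian Elimination test used in Example 5.3.8 can also be carried out by a short program. The following is a minimal Python sketch, not part of the course material: the function names `rank` and `linearly_independent` are our own, and we assume exact rational arithmetic (`fractions.Fraction`) so that no rounding occurs. It counts pivots and declares the vectors independent exactly when every column gets a pivot.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots found by Gaussian elimination (exact arithmetic)."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0  # row where the next pivot will be placed
    for c in range(len(M[0])):
        if r == len(M):
            break
        # Look for a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue  # column c has no pivot (free variable)
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """Vectors in R^n are independent iff [x1 ... xk] has k pivots."""
    if not vectors:
        return True  # the empty collection is independent by convention
    A = [list(row) for row in zip(*vectors)]  # place the vectors as columns
    return rank(A) == len(vectors)

print(linearly_independent([[1, 2, 3], [2, 4, 8], [3, 9, 27]]))   # True
print(linearly_independent([[0, 1, 2], [3, -1, 4], [-3, 3, 0]]))  # False
```

The two calls reproduce parts (1) and (2) of the example: three pivots in the first case, only two in the second.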


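Remark 5.3.9. (1) The empty set $\emptyset$ of vectors in, say, $\mathbb{R}^n$ is considered to be linearly independent. This corresponds to $k = 0$ in Definition 5.3.7.

(2) Suppose $k = 1$ in Definition 5.3.7. This means we have a set of one vector $\mathbf{x}_1 \in \mathbb{R}^n$. Then $\mathbf{x}_1$ is linearly independent iff $\mathbf{x}_1 \neq \mathbf{0}_{n\times 1}$. This is because the equation $c_1\mathbf{x}_1 = \mathbf{0}_{n\times 1}$ requires either $c_1 = 0$ or $\mathbf{x}_1 = \mathbf{0}_{n\times 1}$ in order to be true. If $\mathbf{x}_1 \neq \mathbf{0}_{n\times 1}$, then necessarily $c_1 = 0$.

(3) For two vectors $\mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^n$, $\mathbf{x}_1, \mathbf{x}_2$ are linearly dependent iff there exists $\alpha \in \mathbb{R}$ such that either $\mathbf{x}_1 = \alpha\mathbf{x}_2$ or $\mathbf{x}_2 = \alpha\mathbf{x}_1$.

(4) For three or more vectors, linear dependence does not mean “one vector is a constant multiple of one of the others”. For instance, the following three vectors in $\mathbb{R}^2$:

\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

are linearly dependent because we have a nontrivial dependence relation:

\[ 1 \cdot \begin{bmatrix} 1 \\ 0 \end{bmatrix} + 1 \cdot \begin{bmatrix} 0 \\ 1 \end{bmatrix} + (-1) \cdot \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

even though none of the three vectors is a multiple of either of the other two.

(5) If $\mathbf{x}_1, \ldots, \mathbf{x}_k$ in $\mathbb{R}^n$ are linearly independent, then for $\ell < k$, the smaller collection $\mathbf{x}_1, \ldots, \mathbf{x}_\ell$ is automatically linearly independent.

(6) If $\mathbf{x}_1, \ldots, \mathbf{x}_k$ in $\mathbb{R}^n$ are linearly dependent, then any larger collection

\[ \mathbf{x}_1, \ldots, \mathbf{x}_k, \mathbf{x}_{k+1}, \ldots, \mathbf{x}_{k+\ell} \]

is automatically linearly dependent.

(7) If $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are $k$ distinct vectors in $\mathbb{R}^n$, and $n < k$, then necessarily $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are linearly dependent. This is because the corresponding homogeneous matrix equation

\[ A\mathbf{c} = \mathbf{0}_{n\times 1} \quad \text{where} \quad A = \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_k \end{bmatrix} \]

corresponds to an $n \times k$ matrix $A$ with more columns than rows, so there is guaranteed to be at least one free variable (hence infinitely many solutions).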


Combining the notions of span and linear independence, we arrive at the notion of 
basis and dimension (of a nullspace): 


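Definition 5.3.10. Suppose $A \in \mathrm{Mat}_{m\times n}(\mathbb{R})$. A basis of $\mathrm{null}(A)$ is a collection of vectors $\mathbf{x}_1, \ldots, \mathbf{x}_k \in \mathbb{R}^n$ such that:

(1) $\mathrm{null}(A) = \mathrm{span}(\mathbf{x}_1, \ldots, \mathbf{x}_k)$ (so $\mathbf{x}_1, \ldots, \mathbf{x}_k$ can make all of $\mathrm{null}(A)$ by linear combinations), and

(2) $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are linearly independent (so none of the vectors $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are unnecessary or redundant).

We define the dimension of $\mathrm{null}(A)$ to be the number of vectors in a basis of $\mathrm{null}(A)$. Thus

\[ \dim \mathrm{null}(A) := k \iff \text{there is a basis } \mathbf{x}_1, \ldots, \mathbf{x}_k \text{ of } \mathrm{null}(A) \text{ with } k \text{ vectors} \]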


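Example 5.3.11. Returning to Example 5.3.3, we already saw that:

\[ \mathrm{null}(A) = \mathrm{span}\left( \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \end{bmatrix} \right) \]

Since the vectors

\[ \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \end{bmatrix} \]

are linearly independent, we conclude that

\[ \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \end{bmatrix} \]

is a basis of $\mathrm{null}(A)$ and thus $\dim \mathrm{null}(A) = 2$.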
Here are some general facts to know, which we state without proof: 


Fact 5.3.12. Suppose $A \in \mathrm{Mat}_{m\times n}(\mathbb{R})$.

(1) In general, $\mathrm{null}(A)$ will have infinitely many possible bases, but all of these bases have the same size. Thus the definition of $\dim \mathrm{null}(A)$ does not depend on a particular choice of basis.

(2) Recall that the rank of $A$ (denoted $\mathrm{rank}(A)$) is the number of pivots in the RREF of $A$. In general, $\dim \mathrm{null}(A)$ is equal to the number of free variables in the RREF of $A$. Since the number of pivot variables plus the number of free variables equals the total number of variables $n$, this yields the important rank-nullity formula:

\[ \mathrm{rank}(A) + \dim \mathrm{null}(A) = n = \#\text{ of columns of } A. \]

(3) A basis for $\mathrm{null}(A)$ can be obtained by solving the homogeneous equation $A\mathbf{c} = \mathbf{0}_{m\times 1}$ in the usual way with Gaussian Elimination, writing the solutions in parametric form with the free variables as parameters, then collecting each vector which gets multiplied by a free variable. This (finite) collection of vectors will be a basis for $\mathrm{null}(A)$.
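The recipe in item (3) can be sketched in code. The following is a minimal Python sketch, not part of the course material: the helper names `rref` and `nullspace_basis` are our own, and we assume exact rational arithmetic (`fractions.Fraction`) so that no rounding occurs. It is applied to the matrix of Example 5.3.3 and also confirms the rank-nullity formula for that matrix.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivot_cols): the RREF of the matrix `rows` and its pivot columns."""
    R = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(R), len(R[0])
    pivot_cols = []
    r = 0  # row where the next pivot will be placed
    for c in range(n_cols):
        if r == n_rows:
            break
        # Look for a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, n_rows) if R[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot here, so column c belongs to a free variable
        R[r], R[pivot] = R[pivot], R[r]
        lead = R[r][c]
        R[r] = [x / lead for x in R[r]]
        for i in range(n_rows):
            if i != r:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivot_cols.append(c)
        r += 1
    return R, pivot_cols

def nullspace_basis(rows):
    """Collect one basis vector of null(A) per free variable."""
    R, pivot_cols = rref(rows)
    n_cols = len(rows[0])
    free_cols = [c for c in range(n_cols) if c not in pivot_cols]
    basis = []
    for f in free_cols:
        v = [Fraction(0)] * n_cols
        v[f] = Fraction(1)            # this free variable is 1, the others are 0
        for row_index, p in enumerate(pivot_cols):
            v[p] = -R[row_index][f]   # solve the pivot equation for x_p
        basis.append(v)
    return basis

A = [[1, 1, 0, 4],
     [0, 0, 1, 2]]
_, pivots = rref(A)
basis = nullspace_basis(A)
print([[int(x) for x in v] for v in basis])   # [[-1, 1, 0, 0], [-4, 0, -2, 1]]
print(len(pivots) + len(basis) == len(A[0]))  # rank-nullity check: True
```

Here `len(pivots)` plays the role of rank(A) and `len(basis)` the role of dim null(A), so the last line checks 2 + 2 = 4.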


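Finally, we have the following fact for inhomogeneous equations (analogous to Theorem 4.3.1):


Proposition 5.3.13. Suppose $A \in \mathrm{Mat}_{m\times n}(\mathbb{R})$ and $\mathbf{b} \in \mathbb{R}^m$, and assume $\mathbf{b} \neq \mathbf{0}_{m\times 1}$. Consider the inhomogeneous equation:

(†)    $A\mathbf{x} = \mathbf{b}$

Suppose we have one particular solution $\mathbf{x}_p \in \mathbb{R}^n$ to (†). Then the set of all solutions to (†) is:

\[ \{ \mathbf{x}_p + \mathbf{x}_h : \mathbf{x}_h \in \mathrm{null}(A) \} \]

In other words, every solution to (†) is equal to our particular solution $\mathbf{x}_p$ plus a solution $\mathbf{x}_h$ to the homogeneous equation $A\mathbf{x} = \mathbf{0}_{m\times 1}$.


Proof. Let $\mathbf{x}_h \in \mathrm{null}(A)$ and $\mathbf{x}_p$ be our particular solution to (†). Note that:

\[ \begin{aligned} A(\mathbf{x}_p + \mathbf{x}_h) &= A\mathbf{x}_p + A\mathbf{x}_h \\ &= \mathbf{b} + \mathbf{0}_{m\times 1} \\ &= \mathbf{b}. \end{aligned} \]

Thus $\mathbf{x}_p + \mathbf{x}_h$ is also a solution to (†). Conversely, suppose $\mathbf{x}_1$ is an arbitrary solution to (†). Note that

\[ \begin{aligned} A(\mathbf{x}_1 - \mathbf{x}_p) &= A\mathbf{x}_1 - A\mathbf{x}_p \\ &= \mathbf{b} - \mathbf{b} \\ &= \mathbf{0}_{m\times 1}. \end{aligned} \]

Thus $\mathbf{x}_1 - \mathbf{x}_p \in \mathrm{null}(A)$, so there is $\mathbf{x}_h \in \mathrm{null}(A)$ such that $\mathbf{x}_1 - \mathbf{x}_p = \mathbf{x}_h$. Thus $\mathbf{x}_1 = \mathbf{x}_p + \mathbf{x}_h$.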
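As a small illustration of Proposition 5.3.13, the following Python sketch reuses the matrix and nullspace basis of Example 5.3.3. The right-hand side `b`, the particular solution `x_p`, and the helper names are our own hypothetical choices, made only for this check; they do not come from the text.

```python
def matvec(A, x):
    """Matrix-vector product Ax for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 1, 0, 4],
     [0, 0, 1, 2]]
b = [3, 5]             # hypothetical right-hand side
x_p = [3, 0, 5, 0]     # one particular solution: A x_p = b
h1 = [-1, 1, 0, 0]     # basis of null(A) from Example 5.3.3
h2 = [-4, 0, -2, 1]

def solution(s, t):
    """x_p plus the homogeneous solution s*h1 + t*h2."""
    return [p + s * u + t * v for p, u, v in zip(x_p, h1, h2)]

# Every choice of the parameters s, t gives another solution of Ax = b.
for s, t in [(0, 0), (1, 0), (2, -3), (-5, 7)]:
    assert matvec(A, solution(s, t)) == b
print(matvec(A, solution(2, -3)))   # [3, 5]
```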




Here are some more facts about the number of solutions of a matrix equation in 
terms of the terminology from this section: 


Fact 5.3.14. Suppose $A \in \mathrm{Mat}_{m\times n}(\mathbb{R})$ and $\mathbf{b} \in \mathbb{R}^m$.

(1) The following are equivalent:
    (a) there do not exist any solutions to $A\mathbf{x} = \mathbf{b}$,
    (b) the system corresponding to $A\mathbf{x} = \mathbf{b}$ is inconsistent,
    (c) there does not exist a particular solution $\mathbf{x}_p$ to $A\mathbf{x} = \mathbf{b}$.

We define the matrix equation $A\mathbf{x} = \mathbf{b}$ to be inconsistent if any of the equivalent conditions of (1) above hold. We say that $A\mathbf{x} = \mathbf{b}$ is consistent otherwise.

(2) Suppose $A\mathbf{x} = \mathbf{b}$ is consistent. The following are equivalent:
    (a) there is a unique solution to $A\mathbf{x} = \mathbf{b}$,
    (b) there is a unique solution to the system corresponding to $A\mathbf{x} = \mathbf{b}$,
    (c) $\mathrm{null}(A) = \{\mathbf{0}_{n\times 1}\}$,
    (d) $\dim \mathrm{null}(A) = 0$,
    (e) there are no free variables,
    (f) every variable is a pivot variable,
    (g) $\mathrm{rank}(A) = n$.

(3) Suppose $A\mathbf{x} = \mathbf{b}$ is consistent. The following are equivalent:
    (a) there are infinitely many solutions to $A\mathbf{x} = \mathbf{b}$,
    (b) there are infinitely many solutions to the system corresponding to $A\mathbf{x} = \mathbf{b}$,
    (c) $\mathrm{null}(A) \neq \{\mathbf{0}_{n\times 1}\}$,
    (d) $\dim \mathrm{null}(A) \geq 1$,
    (e) there is at least one free variable,
    (f) $\mathrm{rank}(A) < n$.


Thus, the distinction between 1 solution versus infinitely many solutions to Ax = b 
is entirely determined by null(A). 


5.4. Square matrices and determinants 


In anticipation of Chapter 6, in this section we take a closer look at square matrices.


Definition 5.4.1. We call a matrix $A$ a square matrix if $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ for some $n \geq 1$.


Example 5.4.2. (1) Here are some square matrices of various sizes: 
1 2 3 4 
Ay : i _—2 5 6 7 8 
3°44 7 8 9 9 10 11 12 


13 14 15 16 


(2) Suppose $n \geq 1$. We define the identity matrix to be the square matrix $I = I_{n\times n} \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ which has 1's on the main diagonal and 0's in all other entries, i.e.,

\[ I_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]

Written out, the identity matrix looks like:

\[ I_{n\times n} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \]

The special property of the identity matrix is that for every $\mathbf{v} \in \mathbb{R}^n$, $I_{n\times n}\mathbf{v} = \mathbf{v}$, i.e., multiplying a column vector by an appropriately-sized identity matrix always returns the original vector.


We are interested in answering the following question about square matrices: 


Question 5.4.3. Given a square matrix $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$, how can we tell if $\mathrm{null}(A) \neq \{\mathbf{0}\}$? Put another way, how can we tell if there exists $\mathbf{v} \in \mathbb{R}^n$ such that $A\mathbf{v} = \mathbf{0}$ but $\mathbf{v} \neq \mathbf{0}$?


Of course, one way to answer Question 5.4.3 is to just compute a basis for $\mathrm{null}(A)$, and then check whether the basis is empty or nonempty. However, for our purposes this method is far too cumbersome (in the next section, we will ask this question for an infinite family of matrices simultaneously). Fortunately, there is a much easier way to answer this question: with determinants.


Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ is a square matrix. Then associated to $A$ is a number $\det(A) \in \mathbb{R}$ called the determinant of $A$. In other words, there is a function:

\[ \det : \mathrm{Mat}_{n\times n}(\mathbb{R}) \to \mathbb{R} \]


We will not carefully define this function, but we will give the formula for how to 
compute it. For this class, ultimately we will treat the determinant as a black-box 
and take on faith all of its relevant properties. 


Computing the determinant. For $n = 1$, computing the determinant is easy:

Given $A = [a_{11}] \in \mathrm{Mat}_{1\times 1}(\mathbb{R})$, we have $\det A = a_{11}$.

For $n = 2$, there is also a fairly simple formula for computing the determinant:

Given $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \in \mathrm{Mat}_{2\times 2}(\mathbb{R})$, we have $\det A = a_{11}a_{22} - a_{21}a_{12}$.

Now suppose $n > 2$, and let $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ with entries $a_{ij}$. Then for any $i, j \in \{1, \ldots, n\}$ we define the $ij$-cofactor matrix of $A$ to be the matrix $A_{ij} \in \mathrm{Mat}_{(n-1)\times(n-1)}(\mathbb{R})$ obtained from $A$ by deleting the $i$th row and the $j$th column. Then we can compute the determinant of $A$ by cofactor expansion:

\[ \det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \cdot \det(A_{ij}) \quad \text{for any } 1 \leq i \leq n, \]

i.e., we can use cofactor expansion along any row, not just the top row $i = 1$. Similarly, we can use cofactor expansion along any column to compute the determinant of $A$:

\[ \det(A) = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} \cdot \det(A_{ij}) \quad \text{for any } 1 \leq j \leq n. \]

Note that the cofactor expansion formulas reduce the computation of the determinant of an $n \times n$ matrix down to the computation of several $(n-1) \times (n-1)$ sized determinants. Applying cofactor expansion recursively, eventually the computation will reduce to $2 \times 2$ or $1 \times 1$-sized determinants, which we know how to compute directly from above.


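Example 5.4.4. Consider the $3 \times 3$ matrix

\[ A = \begin{bmatrix} 1 & 3 & -3 \\ -3 & -5 & 2 \\ -4 & 4 & -6 \end{bmatrix} \in \mathrm{Mat}_{3\times 3}(\mathbb{R}). \]

We will calculate the determinant using cofactor expansion along the 1st row ($i = 1$), where $a_{1j}$ denotes the $(1, j)$ entry of $A$ and $A_{1j}$ the corresponding cofactor matrix:

\[ \begin{aligned} \det \begin{bmatrix} 1 & 3 & -3 \\ -3 & -5 & 2 \\ -4 & 4 & -6 \end{bmatrix} &= (-1)^{1+1} a_{11} \det(A_{11}) + (-1)^{1+2} a_{12} \det(A_{12}) + (-1)^{1+3} a_{13} \det(A_{13}) \\ &= \det \begin{bmatrix} -5 & 2 \\ 4 & -6 \end{bmatrix} - 3 \det \begin{bmatrix} -3 & 2 \\ -4 & -6 \end{bmatrix} - 3 \det \begin{bmatrix} -3 & -5 \\ -4 & 4 \end{bmatrix} \\ &= [(-5)(-6) - 2 \cdot 4] - 3[(-3)(-6) - 2(-4)] - 3[(-3)4 - (-5)(-4)] \\ &= 22 - 3 \cdot 26 - 3(-32) = 40. \end{aligned} \]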


In general, when using cofactor expansion to compute determinants, it helps to 
judiciously pick a row or a column that has many zeros, if there is one. 
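The recursive nature of cofactor expansion translates directly into a short recursive function. The following is a minimal Python sketch, not part of the course material: the names `minor` and `det` are our own, and the expansion is always taken along the top row (row index 0 in Python's zero-based numbering), which is just one of the valid choices described above.

```python
def minor(A, i, j):
    """The matrix obtained from A by deleting row i and column j (zero-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the top row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    if n == 2:
        return A[0][0] * A[1][1] - A[1][0] * A[0][1]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

A = [[1, 3, -3],
     [-3, -5, 2],
     [-4, 4, -6]]
print(det(A))   # 40
```

This agrees with the hand computation in Example 5.4.4. For large matrices this approach is very slow compared to elimination-based methods, so it is meant only to mirror the formula.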


Properties of the determinant. The determinant gives us an answer to Question 5.4.3:


Determinant Property 5.4.5. Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$. Then the following are equivalent:

(1) $\det(A) \neq 0$

(2) $\mathrm{null}(A) = \{\mathbf{0}\}$.

In other words, to check if the nullspace has a nontrivial vector in it, just compute the determinant and check if it is $\neq 0$ or $= 0$. As it turns out, the Determinant Property 5.4.5 is really the only thing we need to know about determinants going forward.


Nevertheless, here are some other properties of the determinant which might be useful for computing determinants:

Fact 5.4.6. Suppose $A, B \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ and $\alpha \in \mathbb{R}$. Then
(1) $\det(I_{n\times n}) = 1$
(2) $\det(\alpha A) = \alpha^n \det(A)$
(3) if $B$ is obtained from $A$ by either switching two rows or switching two columns (but not both), then $\det(B) = -\det(A)$.


5.5. Eigenvalues and eigenvectors 


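Recall that the identity matrix $I$ has the property that $I\mathbf{v} = \mathbf{v}$ for any (appropriately-sized) vector $\mathbf{v}$. Another way to say this is that the identity matrix “scales the vector by $\lambda = 1$”, e.g.

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = 1 \cdot \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]

Likewise, the matrix $\alpha I$ will scale a vector by $\lambda = \alpha$, e.g.

\[ \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} = 2 \cdot \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]

Along similar lines, a diagonal matrix will scale certain vectors, but possibly with different scaling factors depending on the vector:

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = 1 \cdot \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \]

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \\ 0 \end{bmatrix} = 2 \cdot \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 3 \end{bmatrix} = 3 \cdot \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

For this reason, diagonal matrices are often very nice matrices to work with (in general the operation of scaling is computationally easier than the operation of matrix multiplication).


The concepts of eigenvalue, eigenvector, eigenspace, and eigenbasis will allow us to treat any square matrix almost as if it were a diagonal matrix.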


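Definition 5.5.1. Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ is a square matrix and $\lambda \in \mathbb{R}$. We say that $\lambda$ is an eigenvalue for $A$ if there exists a nonzero vector $\mathbf{v} \in \mathbb{R}^n$ such that

\[ A\mathbf{v} = \lambda\mathbf{v}. \]

If $\lambda$ is an eigenvalue of $A$, then we call a nonzero vector $\mathbf{v} \in \mathbb{R}^n$ which satisfies $A\mathbf{v} = \lambda\mathbf{v}$ an eigenvector of $A$ associated to $\lambda$.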


The goal of this section is to answer the following question: 


Question 5.5.2. Given $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$, how do we
(1) find all eigenvalues $\lambda$ of $A$,
and for each eigenvalue how do we
(2) find all eigenvectors $\mathbf{v}$ associated to $\lambda$?


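The answer to Question 5.5.2(1) actually follows quite nicely from the Determinant Property 5.4.5. Indeed, suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$, $\lambda \in \mathbb{R}$ and note that we have the following equivalences:

\[ \begin{aligned} & \lambda \text{ is an eigenvalue of } A \\ \iff\ & \text{there exists nonzero } \mathbf{v} \in \mathbb{R}^n \text{ such that } A\mathbf{v} = \lambda\mathbf{v} \\ \iff\ & \text{there exists nonzero } \mathbf{v} \in \mathbb{R}^n \text{ such that } A\mathbf{v} - \lambda\mathbf{v} = \mathbf{0} \\ \iff\ & \text{there exists nonzero } \mathbf{v} \in \mathbb{R}^n \text{ such that } A\mathbf{v} - \lambda I\mathbf{v} = \mathbf{0} \\ \iff\ & \text{there exists nonzero } \mathbf{v} \in \mathbb{R}^n \text{ such that } (A - \lambda I)\mathbf{v} = \mathbf{0} \\ \iff\ & \mathrm{null}(A - \lambda I) \neq \{\mathbf{0}\} \\ \iff\ & \det(A - \lambda I) = 0, \text{ by the Determinant Property 5.4.5.} \end{aligned} \]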


Comparing the first and last part of the equivalence gives us an answer to part (1) 
of our question: 


Eigenvalue Theorem 5.5.3. Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ and $\lambda \in \mathbb{R}$. Then the following are equivalent:

(1) $\lambda$ is an eigenvalue of $A$,

(2) $\det(A - \lambda I) = 0$.


In other words, the eigenvalues of $A$ are zeros of the “function” $\det(A - \lambda I)$. As it turns out, the expression $\det(A - \lambda I)$ is always a polynomial in the variable $\lambda$. This polynomial has a special name:


Definition 5.5.4. Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$. The polynomial²

\[ p(\lambda) := (-1)^n \det(A - \lambda I) = \det(\lambda I - A) \]

is called the characteristic polynomial of $A$, and the equation

\[ p(\lambda) = 0 \]

is called the characteristic equation.


Thus the Eigenvalue Theorem 5.5.3 states that the eigenvalues of $A$ are precisely the zeros of its characteristic polynomial.
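To make the definition concrete, here is the characteristic polynomial of a general $2 \times 2$ matrix worked out directly from the $2 \times 2$ determinant formula. The entry names $a, b, c, d$ are our own notation for this illustration.

```latex
\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
\quad\Longrightarrow\quad
p(\lambda) = \det(\lambda I - A)
= \det \begin{bmatrix} \lambda - a & -b \\ -c & \lambda - d \end{bmatrix}
= (\lambda - a)(\lambda - d) - bc
= \lambda^2 - (a + d)\lambda + (ad - bc).
\]
```

So in the $2 \times 2$ case the characteristic polynomial is a monic quadratic whose constant term is $\det(A)$ and whose linear coefficient is minus the sum of the diagonal entries.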


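Example 5.5.5. Find the eigenvalues for the following matrix:

\[ A = \begin{bmatrix} 4 & 0 & -2 \\ 1 & 1 & 2 \\ 0 & 0 & 2 \end{bmatrix} \]

SOLUTION. We will first determine the characteristic polynomial of $A$. Note that

\[ \begin{aligned} p(\lambda) = (-1)^3 \det(A - \lambda I) &= (-1)^3 \det \begin{bmatrix} 4 - \lambda & 0 & -2 \\ 1 & 1 - \lambda & 2 \\ 0 & 0 & 2 - \lambda \end{bmatrix} \\ &= -(4 - \lambda) \det \begin{bmatrix} 1 - \lambda & 2 \\ 0 & 2 - \lambda \end{bmatrix} + 2 \det \begin{bmatrix} 1 & 1 - \lambda \\ 0 & 0 \end{bmatrix} \\ &\qquad \text{(using cofactor expansion along the top row)} \\ &= -(4 - \lambda)(1 - \lambda)(2 - \lambda) \\ &= (\lambda - 4)(\lambda - 1)(\lambda - 2). \end{aligned} \]

Thus we get three distinct eigenvalues: $\lambda_1 = 1$, $\lambda_2 = 2$, and $\lambda_3 = 4$.

² The factor $(-1)^n$ ensures that the polynomial is monic.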


Next we turn our attention to finding eigenvectors corresponding to a particular eigenvalue. Suppose $\lambda$ is an eigenvalue of $A$. We already saw that an eigenvector $\mathbf{v}$ is a nonzero vector such that $(A - \lambda I)\mathbf{v} = \mathbf{0}$. Thus $\mathbf{v} \in \mathrm{null}(A - \lambda I)$. In fact, every nonzero vector in $\mathrm{null}(A - \lambda I)$ is an eigenvector associated to $\lambda$. This motivates the following definition:


Definition 5.5.6. Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ and $\lambda$ is an eigenvalue of $A$. We define the eigenspace of $\lambda$ to be

\[ E_\lambda := \mathrm{null}(A - \lambda I), \]

i.e., the eigenspace $E_\lambda$ is the set of all eigenvectors associated to $\lambda$ together with the zero vector.³


Since an eigenspace is a nullspace, we know how to find a basis for it: 


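Example 5.5.7. Find all eigenvectors of the matrix

\[ A = \begin{bmatrix} 4 & 0 & -2 \\ 1 & 1 & 2 \\ 0 & 0 & 2 \end{bmatrix} \]

SOLUTION. In Example 5.5.5 we found three distinct eigenvalues: $\lambda_1 = 1$, $\lambda_2 = 2$, and $\lambda_3 = 4$. For each of these eigenvalues, we need to compute a basis of its eigenspace.

($\lambda_1 = 1$) We will compute a basis of

\[ \mathrm{null}(A - I) = \mathrm{null} \begin{bmatrix} 3 & 0 & -2 \\ 1 & 0 & 2 \\ 0 & 0 & 1 \end{bmatrix} \]

Note that

\[ \left[\begin{array}{ccc|c} 3 & 0 & -2 & 0 \\ 1 & 0 & 2 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

We see that $x_2$ is a free variable and thus the general solution is:

\[ \begin{aligned} x_1 &= 0 \\ x_2 &= x_2 \\ x_3 &= 0 \end{aligned} \]

Thus we can express the eigenspace $E_1$ as

\[ E_1 = \mathrm{null}(A - I) = \mathrm{span}\left( \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right) \]

($\lambda_2 = 2$) We compute a basis of $\mathrm{null}(A - 2I)$:

\[ \left[\begin{array}{ccc|c} 2 & 0 & -2 & 0 \\ 1 & -1 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & -3 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

We see that $x_3$ is a free variable and the general solution is

\[ \begin{aligned} x_1 &= x_3 \\ x_2 &= 3x_3 \\ x_3 &= x_3 \end{aligned} \]

Thus we can express the eigenspace $E_2$ as

\[ E_2 = \mathrm{null}(A - 2I) = \mathrm{span}\left( \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} \right) \]

($\lambda_3 = 4$) We compute a basis of $\mathrm{null}(A - 4I)$:

\[ \left[\begin{array}{ccc|c} 0 & 0 & -2 & 0 \\ 1 & -3 & 2 & 0 \\ 0 & 0 & -2 & 0 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{ccc|c} 1 & -3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

We see that $x_2$ is a free variable and the general solution is

\[ \begin{aligned} x_1 &= 3x_2 \\ x_2 &= x_2 \\ x_3 &= 0 \end{aligned} \]

Thus we can express the eigenspace $E_4$ as

\[ E_4 = \mathrm{null}(A - 4I) = \mathrm{span}\left( \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} \right) \]

³ The zero vector is always included in every eigenspace, although the zero vector is never considered an eigenvector.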

We have one final definition: 


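Definition 5.5.8. Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ is a square matrix. An eigenbasis of $A$ is a basis of $\mathbb{R}^n$ which is composed of eigenvectors of $A$. In other words, a set of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n \in \mathbb{R}^n$ is an eigenbasis of $A$ if
(1) $A\mathbf{v}_i = \lambda_i\mathbf{v}_i$ for some $\lambda_i$, for each $i = 1, \ldots, n$,
(2) $\mathbb{R}^n = \mathrm{span}(\mathbf{v}_1, \ldots, \mathbf{v}_n)$,
(3) $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly independent.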


Here is a fact about eigenbases which we are happy to assume:

Fact 5.5.9. Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ has distinct eigenvalues $\lambda_1, \ldots, \lambda_k$, for some $k \leq n$. If

(1) $\beta_i$ is a basis of $E_{\lambda_i}$ for each $i = 1, \ldots, k$ and

(2) $|\beta_1| + |\beta_2| + \cdots + |\beta_k| = n$,

then $\beta := \beta_1 \cup \beta_2 \cup \cdots \cup \beta_k$ is an eigenbasis of $A$. In particular, if $k = n$, then $\beta = \beta_1 \cup \cdots \cup \beta_n$ is always an eigenbasis (i.e., condition (2) is automatically satisfied).


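Example 5.5.10. Find an eigenbasis of

\[ A = \begin{bmatrix} 4 & 0 & -2 \\ 1 & 1 & 2 \\ 0 & 0 & 2 \end{bmatrix} \]

SOLUTION. In Example 5.5.7 we found that $E_1$ had basis

\[ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \]

$E_2$ had basis

\[ \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}, \]

and $E_4$ had basis

\[ \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix}. \]

Then by Fact 5.5.9 the following is an eigenbasis of $A$:

\[ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}, \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} \]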
We conclude this section with a few remarks: 


Remark 5.5.11. Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$.

(1) It is possible that an eigenbasis of $A$ does not exist. This can only happen when $p(\lambda)$ has repeated roots. We will see what to do in this situation in the next chapter.

(2) It is possible that some of the eigenvalues of $A$ are complex. In this case, the corresponding eigenvectors will have complex entries, but otherwise everything else is the same. We will see what complex eigenvalues/vectors means for us in the next chapter.

(3) If all the roots of $p(\lambda)$ are distinct and real, then there will be $n$ distinct real eigenvalues and thus an eigenbasis will always exist.


CHAPTER 6


Systems of differential equations 


Up until this point, we have only considered differential equations with one unknown function $y(t)$, e.g.,

\[ y' = f(t, y) \]
\[ y'' + p(t)y' + q(t)y = g(t) \]


Unfortunately, in real world problems you generally have many unknown variables 
you are interested in and you are rarely ever lucky enough to have just a single 
unknown. Therefore, just like with linear equations, we have to consider now 
differential equations with multiple unknown functions which might be entangled 
with each other in various ways. 


In this final chapter, we will study systems of differential equations, i.e., multiple 
equations which relate multiple unknown functions and their derivatives. For the 
sake of time, we will focus on a very special case: homogeneous linear first-order 
systems with constant coefficients. 


6.1. Homogeneous linear systems with constant coefficients 


Here is a typical example of the type of system we will consider: 
Example 6.1.1. What are the solutions to the following system:

\[ \begin{aligned} x_1' &= x_1 + 2x_2 \\ x_2' &= 2x_1 + x_2 \end{aligned} \]

Here, a solution is a pair of functions $x_1(t), x_2(t)$ such that when you plug both functions in, then both equations are satisfied. One can easily check that the pair $x_1 = e^{-t}, x_2 = -e^{-t}$ and the pair $x_1 = e^{3t}, x_2 = e^{3t}$ are both solutions to the system. In fact, we will see that the set of all solutions is precisely the set of all linear combinations of these two pairs.


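Our first goal is to learn how to solve systems like the one in Example 6.1.1 above. This requires basically two things:

(1) Reinterpret these systems in terms of linear algebra (i.e., column vectors and matrices).

(2) Exploit as much of the Chapter 5 material as possible to make computations as straightforward as possible.

Definition 6.1.2. A homogeneous linear system of differential equations (with constant coefficients) is a set of differential equations of the following form:

(†)
\[ \begin{aligned} x_1'(t) &= a_{11}x_1(t) + \cdots + a_{1n}x_n(t) \\ x_2'(t) &= a_{21}x_1(t) + \cdots + a_{2n}x_n(t) \\ &\ \ \vdots \\ x_n'(t) &= a_{n1}x_1(t) + \cdots + a_{nn}x_n(t) \end{aligned} \]

where each $a_{ij} \in \mathbb{R}$ and $x_1(t), \ldots, x_n(t)$ are unknown functions. A solution to (†) is a collection of $n$ differentiable functions $x_1, x_2, \ldots, x_n : I \to \mathbb{R}$ (where $I \subseteq \mathbb{R}$ is an interval) such that plugging these functions in to (†) makes each equation true.


We will prefer to write systems in terms of matrices and vectors, so we can rewrite (†) above as:

\[ \begin{bmatrix} x_1'(t) \\ x_2'(t) \\ \vdots \\ x_n'(t) \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix} \]

or even as:

\[ \mathbf{x}' = A\mathbf{x} \]

where

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \quad \text{and} \quad \mathbf{x} = \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix} \]

Note that with this notation, a solution is now a vector-valued function $\mathbf{x}(t) : \mathbb{R} \to \mathbb{R}^n$.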


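Example 6.1.3. We can rewrite Example 6.1.1 now as:

\[ \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}' = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} \]

or as just

\[ \mathbf{x}' = A\mathbf{x} \quad \text{where} \quad A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \]

We were given two distinct solutions, which we can now write as:

\[ \mathbf{x}_1(t) = \begin{bmatrix} e^{-t} \\ -e^{-t} \end{bmatrix} \quad \text{and} \quad \mathbf{x}_2(t) = \begin{bmatrix} e^{3t} \\ e^{3t} \end{bmatrix} \]

To verify $\mathbf{x}_2(t)$ is a solution, first we can compute the lefthand side:

\[ \mathbf{x}_2'(t) = \begin{bmatrix} e^{3t} \\ e^{3t} \end{bmatrix}' = \begin{bmatrix} 3e^{3t} \\ 3e^{3t} \end{bmatrix} \]

Next we compute the righthand side:

\[ A\mathbf{x}_2(t) = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} e^{3t} \\ e^{3t} \end{bmatrix} = \begin{bmatrix} e^{3t} + 2e^{3t} \\ 2e^{3t} + e^{3t} \end{bmatrix} = \begin{bmatrix} 3e^{3t} \\ 3e^{3t} \end{bmatrix} \]

Since the lefthand side equals the righthand side, we see that $\mathbf{x}_2(t)$ is indeed a solution. ($\mathbf{x}_1(t)$ can be verified in a similar way.)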


Before proceeding with how to find the solutions to systems like (†), we will first say a few general things about what the set of all solutions can look like. Our first result should come as no surprise:


Proposition 6.1.4. Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ and $\mathbf{x}_1(t), \ldots, \mathbf{x}_k(t)$ are solutions to the system

\[ \mathbf{x}' = A\mathbf{x}. \]

Then for every $c_1, \ldots, c_k \in \mathbb{R}$ the linear combination $c_1\mathbf{x}_1(t) + \cdots + c_k\mathbf{x}_k(t)$ is also a solution.


PROOF IDEA. This is because “taking the derivative” and “multiplying on the left by $A$” are both linear operations.


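Definition 6.1.5. Suppose $\mathbf{x}_1(t), \mathbf{x}_2(t), \ldots, \mathbf{x}_k(t) : I \to \mathbb{R}^n$ are vector-valued functions. We say that $\mathbf{x}_1(t), \mathbf{x}_2(t), \ldots, \mathbf{x}_k(t)$ are linearly independent if for every $c_1, \ldots, c_k \in \mathbb{R}$, if

\[ c_1\mathbf{x}_1(t) + \cdots + c_k\mathbf{x}_k(t) = \mathbf{0} \quad \text{for all } t \in I, \]

then $c_1 = c_2 = \cdots = c_k = 0$.
Otherwise, we say that $\mathbf{x}_1(t), \mathbf{x}_2(t), \ldots, \mathbf{x}_k(t)$ are linearly dependent.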


Fortunately, the next fact says that checking linear (in)dependence of vector-valued functions boils down to checking whether certain vectors in $\mathbb{R}^n$ are linearly (in)dependent:


Fact 6.1.6. Suppose $\mathbf{x}_1(t), \mathbf{x}_2(t), \ldots, \mathbf{x}_k(t)$ are solutions to $\mathbf{x}' = A\mathbf{x}$. If there is some fixed $t_0$ such that the column vectors $\mathbf{x}_1(t_0), \ldots, \mathbf{x}_k(t_0) \in \mathbb{R}^n$ are linearly dependent (respectively, linearly independent), then the functions $\mathbf{x}_1(t), \mathbf{x}_2(t), \ldots, \mathbf{x}_k(t)$ are linearly dependent (resp., linearly independent).


We can now state (without proof) the general theorem describing the structure 
of the set of all solutions. 


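Theorem 6.1.7. Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$ and $\mathbf{x}_1(t), \ldots, \mathbf{x}_n(t)$ are $n$ linearly independent solutions to

\[ \mathbf{x}' = A\mathbf{x}. \]

Then $\mathbf{x}_1(t), \ldots, \mathbf{x}_n(t)$ form a fundamental set of solutions, i.e., if $\mathbf{x}_0(t)$ is an arbitrary solution, then there are (necessarily unique) $c_1, \ldots, c_n \in \mathbb{R}$ such that

\[ \mathbf{x}_0(t) = c_1\mathbf{x}_1(t) + c_2\mathbf{x}_2(t) + \cdots + c_n\mathbf{x}_n(t) \quad \text{for every } t. \]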


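Therefore, just as with homogeneous second-order linear differential equations and homogeneous matrix equations, the goal for linear systems $\mathbf{x}' = A\mathbf{x}$ is to find an appropriate number of linearly independent solutions. We now proceed with actually computing solutions to equations $\mathbf{x}' = A\mathbf{x}$. The primary idea is the following:


Proposition 6.1.8. Suppose $A \in \mathrm{Mat}_{n\times n}(\mathbb{R})$, $\lambda$ is an eigenvalue of $A$, and $\mathbf{v}$ is an eigenvector associated to $\lambda$. Then

\[ \mathbf{x}(t) := e^{\lambda t}\mathbf{v} \]

is a solution to the system $\mathbf{x}' = A\mathbf{x}$ and satisfies the initial condition $\mathbf{x}(0) = \mathbf{v}$.

Proof. Let $\mathbf{x}(t) = e^{\lambda t}\mathbf{v}$ be as in the statement of the proposition. Note that the lefthand side yields:

\[ \mathbf{x}'(t) = (e^{\lambda t}\mathbf{v})' = (e^{\lambda t})'\mathbf{v} = \lambda e^{\lambda t}\mathbf{v} = \lambda\mathbf{x}(t) \]

Whereas the righthand side yields:

\[ A\mathbf{x}(t) = Ae^{\lambda t}\mathbf{v} = e^{\lambda t}A\mathbf{v} = e^{\lambda t}\lambda\mathbf{v} = \lambda\mathbf{x}(t). \]

Since both sides equal $\lambda\mathbf{x}(t)$, we see that $\mathbf{x}(t)$ is a solution. Finally, $\mathbf{x}(0) = e^{0}\mathbf{v} = \mathbf{v}$.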
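Proposition 6.1.8 can also be sanity-checked numerically. The following is a minimal Python sketch, not part of the course material: the function names are our own, and the derivative is only approximated by a central difference with a hypothetical step size `h`, so the comparison uses a tolerance. It uses the matrix from Example 6.1.1, whose two given solutions correspond to the eigenpairs $(3, (1, 1))$ and $(-1, (1, -1))$.

```python
import math

def matvec(A, x):
    """Matrix-vector product Ax for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def x(t, lam, v):
    """The candidate solution x(t) = e^(lam*t) * v."""
    return [math.exp(lam * t) * vi for vi in v]

def derivative(f, t, h=1e-6):
    """Central-difference approximation of f'(t) for a vector-valued f."""
    plus, minus = f(t + h), f(t - h)
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]

def solves_system(A, lam, v, times=(0.0, 0.5, 1.0)):
    """Check x'(t) = A x(t) (up to numerical error) at a few sample times."""
    for t in times:
        lhs = derivative(lambda s: x(s, lam, v), t)
        rhs = matvec(A, x(t, lam, v))
        if not all(math.isclose(l, r, rel_tol=1e-6) for l, r in zip(lhs, rhs)):
            return False
    return True

A = [[1, 2], [2, 1]]
print(solves_system(A, 3, [1, 1]))     # True
print(solves_system(A, -1, [1, -1]))   # True
print(solves_system(A, 3, [1, -1]))    # False: (1, -1) is not an eigenvector for 3
```

The last call shows that pairing an eigenvalue with the wrong vector does not produce a solution, which is why the eigenvector condition in the proposition matters.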


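Example 6.1.9. Find all solutions to the linear system:

\[ \mathbf{x}' = A\mathbf{x} \quad \text{where} \quad A = \begin{bmatrix} -4 & 6 \\ -3 & 5 \end{bmatrix}. \]

SOLUTION. Proposition 6.1.8 suggests we should first look for eigenvalues and eigenvectors of $A$. First we obtain the characteristic polynomial:

\[ \begin{aligned} p(\lambda) &= \det \begin{bmatrix} -4 - \lambda & 6 \\ -3 & 5 - \lambda \end{bmatrix} \\ &= (-4 - \lambda)(5 - \lambda) - 6(-3) \\ &= -20 + 4\lambda - 5\lambda + \lambda^2 + 18 \\ &= \lambda^2 - \lambda - 2 \\ &= (\lambda - 2)(\lambda + 1). \end{aligned} \]

Thus our eigenvalues are $\lambda_1 = 2$ and $\lambda_2 = -1$. Now we will find the associated eigenvectors:

($\lambda_1 = 2$) We will find a basis for $\mathrm{null}(A - 2I)$. Solving the associated homogeneous equation yields:

\[ \left[\begin{array}{cc|c} -6 & 6 & 0 \\ -3 & 3 & 0 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{cc|c} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right] \]

Thus we found one eigenvector:

\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

($\lambda_2 = -1$) We will find a basis for $\mathrm{null}(A + I)$. Solving the associated homogeneous equation yields:

\[ \left[\begin{array}{cc|c} -3 & 6 & 0 \\ -3 & 6 & 0 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{cc|c} 1 & -2 & 0 \\ 0 & 0 & 0 \end{array}\right] \]

Thus we found one eigenvector:

\[ \mathbf{v}_2 = \begin{bmatrix} 2 \\ 1 \end{bmatrix} \]

Proposition 6.1.8 tells us that the following two vector-valued functions are solutions to $\mathbf{x}' = A\mathbf{x}$:

\[ \mathbf{x}_1(t) := e^{2t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad \text{and} \quad \mathbf{x}_2(t) := e^{-t} \begin{bmatrix} 2 \\ 1 \end{bmatrix} \]

Since $\mathbf{v}_1, \mathbf{v}_2$ are linearly independent, Fact 6.1.6 tells us that $\mathbf{x}_1(t), \mathbf{x}_2(t)$ are linearly independent vector-valued functions. Finally, Theorem 6.1.7 tells us that the general solution to $\mathbf{x}' = A\mathbf{x}$ is:

\[ \mathbf{x}(t; C_1, C_2) = C_1 e^{2t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + C_2 e^{-t} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} C_1 e^{2t} + 2C_2 e^{-t} \\ C_1 e^{2t} + C_2 e^{-t} \end{bmatrix} \]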
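Once the general solution is known, the constants $C_1, C_2$ can be fitted to an initial condition by solving $C_1\mathbf{v}_1 + C_2\mathbf{v}_2 = \mathbf{x}(0)$. The following is a minimal Python sketch of this step for Example 6.1.9. It is not part of the course material: the function names and the initial condition $\mathbf{x}(0) = (3, 2)$ are our own hypothetical choices, and the $2 \times 2$ system is solved with Cramer's rule for brevity.

```python
import math

A = [[-4, 6], [-3, 5]]
L1, V1 = 2, [1, 1]     # eigenpair found in Example 6.1.9
L2, V2 = -1, [2, 1]    # eigenpair found in Example 6.1.9

def constants_for(x0):
    """Solve C1*V1 + C2*V2 = x0 by Cramer's rule."""
    d = V1[0] * V2[1] - V2[0] * V1[1]           # det [v1 v2], nonzero by independence
    c1 = (x0[0] * V2[1] - V2[0] * x0[1]) / d
    c2 = (V1[0] * x0[1] - x0[0] * V1[1]) / d
    return c1, c2

def solution(t, c1, c2):
    """The general solution C1*e^(2t)*v1 + C2*e^(-t)*v2 evaluated at time t."""
    e1, e2 = math.exp(L1 * t), math.exp(L2 * t)
    return [c1 * e1 * a + c2 * e2 * b for a, b in zip(V1, V2)]

c1, c2 = constants_for([3, 2])   # hypothetical initial condition x(0) = (3, 2)
print(c1, c2)                    # 1.0 1.0
print(solution(0.0, c1, c2))     # [3.0, 2.0]
```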
6.2. Planar systems 


In this section we will take a closer look at the 2 x 2 case. In this case, the charac- 
teristic polynomial is a quadratic polynomial, so there are three cases: distinct real 
roots case, complex conjugate roots case, and double real root case. Furthermore, 
the double real root case splits into two cases (because of the linear algebra): an 
easy case and an interesting case. We will say what to do in all four of these cases. 


Distinct real roots case. The first case is when $p(\lambda)$ has two distinct real roots, i.e., $A$ has two distinct real eigenvalues. We first give an example and then proceed with a general statement.


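Example 6.2.1. Find the general solution to the following linear system:

\[ \mathbf{x}' = A\mathbf{x} \quad \text{where} \quad A = \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix} \]

SOLUTION. First we need to find the eigenvalues and associated eigenvectors of $A$. The characteristic polynomial is

\[ p(\lambda) = \det \begin{bmatrix} -1 - \lambda & 1 \\ 1 & -1 - \lambda \end{bmatrix} = (-1 - \lambda)^2 - 1 = \lambda^2 + 2\lambda = \lambda(\lambda + 2). \]

Thus the eigenvalues are $\lambda_1 = -2$ and $\lambda_2 = 0$. Now we find the associated eigenvectors.

($\lambda_1 = -2$) We need to find a basis for $\mathrm{null}(A + 2I)$. Note that

\[ \left[\begin{array}{cc|c} 1 & 1 & 0 \\ 1 & 1 & 0 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{cc|c} 1 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right] \]

This yields the following eigenvector:

\[ \mathbf{v}_1 = \begin{bmatrix} -1 \\ 1 \end{bmatrix} \]

($\lambda_2 = 0$) We need to find a basis for $\mathrm{null}(A - 0I) = \mathrm{null}(A)$. Note that

\[ \left[\begin{array}{cc|c} -1 & 1 & 0 \\ 1 & -1 & 0 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{cc|c} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right] \]

This yields the following eigenvector:

\[ \mathbf{v}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

By Proposition 6.1.8, we obtain the two solutions:

\[ \mathbf{x}_1(t) := e^{-2t}\mathbf{v}_1 = e^{-2t} \begin{bmatrix} -1 \\ 1 \end{bmatrix} \quad \text{and} \quad \mathbf{x}_2(t) := e^{0t}\mathbf{v}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

Next, since the two column vectors $\mathbf{x}_1(0) = \mathbf{v}_1$ and $\mathbf{x}_2(0) = \mathbf{v}_2$ are linearly independent, by Fact 6.1.6 it follows that the two solutions $\mathbf{x}_1(t)$ and $\mathbf{x}_2(t)$ are linearly independent. Thus, by Theorem 6.1.7 we conclude the general solution is

\[ \mathbf{x}(t; C_1, C_2) = C_1\mathbf{x}_1(t) + C_2\mathbf{x}_2(t) = C_1 e^{-2t} \begin{bmatrix} -1 \\ 1 \end{bmatrix} + C_2 \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -C_1 e^{-2t} + C_2 \\ C_1 e^{-2t} + C_2 \end{bmatrix} \]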

The general case works exactly the same way: 


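Theorem 6.2.2 (Distinct real roots). Suppose $A \in \mathrm{Mat}_{2\times 2}(\mathbb{R})$ has two distinct real eigenvalues $\lambda_1 \neq \lambda_2 \in \mathbb{R}$. Furthermore, suppose $\mathbf{v}_1$ is an eigenvector associated with $\lambda_1$ and $\mathbf{v}_2$ is an eigenvector associated with $\lambda_2$. Then the general solution to $\mathbf{x}' = A\mathbf{x}$ is

\[ \mathbf{x}(t; C_1, C_2) = C_1 e^{\lambda_1 t}\mathbf{v}_1 + C_2 e^{\lambda_2 t}\mathbf{v}_2. \]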
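Theorem 6.2.2 is mechanical enough to be sketched as a small program. The following Python sketch is not part of the course material and is only one possible way to organize the computation: the function names are our own, the eigenvalues are found with the quadratic formula applied to the characteristic polynomial of a $2 \times 2$ matrix, and each eigenvector is read off from a row of $A - \lambda I$ rather than by Gaussian Elimination. Floating-point arithmetic is assumed, so results are approximate in general.

```python
import math

def eigenpairs_2x2(A):
    """Eigenpairs of a 2x2 matrix with two distinct real eigenvalues."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det   # discriminant of lam^2 - trace*lam + det
    if disc <= 0:
        raise ValueError("this sketch only covers two distinct real eigenvalues")
    root = math.sqrt(disc)
    pairs = []
    for lam in ((trace - root) / 2, (trace + root) / 2):
        # A row (r1, r2) of A - lam*I is orthogonal to the vector (r2, -r1),
        # so each nonzero row gives an eigenvector; keep the larger candidate.
        cand1 = [b, lam - a]
        cand2 = [lam - d, c]
        norm1 = abs(cand1[0]) + abs(cand1[1])
        norm2 = abs(cand2[0]) + abs(cand2[1])
        pairs.append((lam, cand1 if norm1 >= norm2 else cand2))
    return pairs

def general_solution(A):
    """Return x(t, C1, C2) = C1*e^(l1*t)*v1 + C2*e^(l2*t)*v2."""
    (l1, v1), (l2, v2) = eigenpairs_2x2(A)
    def x(t, C1, C2):
        return [C1 * math.exp(l1 * t) * p + C2 * math.exp(l2 * t) * q
                for p, q in zip(v1, v2)]
    return x

A = [[-1, 1], [1, -1]]       # matrix from Example 6.2.1
print(eigenpairs_2x2(A))     # [(-2.0, [1, -1.0]), (0.0, [1, 1.0])]
```

The eigenvector returned for $\lambda = -2$ is $(1, -1)$, the negative of the one chosen in Example 6.2.1. This is harmless, since any nonzero scalar multiple of an eigenvector is again an eigenvector and the difference is absorbed into the constant $C_1$.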


Complex conjugate roots case. The next case we will consider is when $p(\lambda)$ has a complex conjugate pair of complex (non-real) roots.


Example 6.2.3. Find the general solution to the following linear system:
\[ \mathbf{x}' = A\mathbf{x} \quad\text{where}\quad A = \begin{bmatrix} 0 & 1 \\ -2 & 2 \end{bmatrix}. \]


SOLUTION. First we need to find the eigenvalues and associated eigenvectors of A. 
The characteristic polynomial is 


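\[ p(\lambda) = \det\begin{bmatrix} -\lambda & 1 \\ -2 & 2-\lambda \end{bmatrix} = -\lambda(2-\lambda) + 2 = \lambda^2 - 2\lambda + 2 \]
and so the eigenvalues are
\[ \lambda_1, \lambda_2 = \frac{2 \pm \sqrt{4 - 8}}{2} = 1 \pm i, \]
so $\lambda_1 = 1 + i$ and $\lambda_2 = 1 - i = \overline{\lambda_1}$.
($\lambda_1 = 1 + i$) We need to find a basis for $\operatorname{null}(A - (1+i)I)$. Note that
\[ \left[\begin{array}{cc|c} -1-i & 1 & 0 \\ -2 & 1-i & 0 \end{array}\right] \xrightarrow{\text{to RREF}} \left[\begin{array}{cc|c} 1 & (-1+i)/2 & 0 \\ 0 & 0 & 0 \end{array}\right]. \]
This yields the following eigenvector:
\[ \begin{bmatrix} (1-i)/2 \\ 1 \end{bmatrix}. \]
However, for convenience, we can scale this eigenvector by $1 + i$ and instead use:
\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 1+i \end{bmatrix}. \]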


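($\lambda_2 = 1 - i$) In this case, since $\lambda_2 = \overline{\lambda_1}$, $A\mathbf{v}_1 = \lambda_1\mathbf{v}_1$, and $\overline{A} = A$, taking
complex conjugates yields:
\[ A\mathbf{v}_1 = \lambda_1\mathbf{v}_1 \implies A\overline{\mathbf{v}_1} = \lambda_2\overline{\mathbf{v}_1}. \]
This yields the following eigenvector associated to $\lambda_2$:
\[ \mathbf{v}_2 = \overline{\mathbf{v}_1} = \begin{bmatrix} 1 \\ 1-i \end{bmatrix}. \]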
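Next, an earlier Proposition tells us that the following are both solutions to $\mathbf{x}' = A\mathbf{x}$:
\[ \mathbf{z}_1(t) := e^{\lambda_1 t}\mathbf{v}_1 = e^{(1+i)t}\begin{bmatrix} 1 \\ 1+i \end{bmatrix}, \qquad \mathbf{z}_2(t) := e^{\lambda_2 t}\mathbf{v}_2 = e^{(1-i)t}\begin{bmatrix} 1 \\ 1-i \end{bmatrix}. \]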


However, we are not done yet, since $\mathbf{z}_1(t)$ and $\mathbf{z}_2(t) = \overline{\mathbf{z}_1(t)}$ are complex-valued
solutions and we are ultimately looking for two linearly independent real-valued
solutions. To find real-valued solutions, we can essentially do the same trick we
used in an earlier Theorem, i.e., taking the real and imaginary parts of $\mathbf{z}_1(t)$. To
justify this, recall from an earlier Proposition that the set of all solutions to $\mathbf{x}' = A\mathbf{x}$ is
closed under linear combinations. Thus
\[ \mathbf{x}(t) := \frac{\mathbf{z}_1(t) + \mathbf{z}_2(t)}{2} = \operatorname{Re}(\mathbf{z}_1(t)) \]
\[ \mathbf{y}(t) := \frac{\mathbf{z}_1(t) - \mathbf{z}_2(t)}{2i} = \operatorname{Im}(\mathbf{z}_1(t)) \]


are also both solutions. Now we will use Euler’s formula to get a better description 
of x(t) and y(t). Note that 


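\[
\begin{aligned}
\mathbf{z}_1(t) &= e^{(1+i)t}\begin{bmatrix} 1 \\ 1+i \end{bmatrix} \\
&= e^t(\cos t + i\sin t)\left(\begin{bmatrix} 1 \\ 1 \end{bmatrix} + i\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) \\
&= e^t\left(\cos t\begin{bmatrix} 1 \\ 1 \end{bmatrix} - \sin t\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) + ie^t\left(\cos t\begin{bmatrix} 0 \\ 1 \end{bmatrix} + \sin t\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right) \\
&= e^t\begin{bmatrix} \cos t \\ \cos t - \sin t \end{bmatrix} + ie^t\begin{bmatrix} \sin t \\ \cos t + \sin t \end{bmatrix}.
\end{aligned}
\]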


Taking real and imaginary parts yields: 


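\[ \mathbf{x}(t) = e^t\begin{bmatrix} \cos t \\ \cos t - \sin t \end{bmatrix}, \qquad \mathbf{y}(t) = e^t\begin{bmatrix} \sin t \\ \cos t + \sin t \end{bmatrix}. \]
Finally, since $\mathbf{x}(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\mathbf{y}(0) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ are linearly independent, it follows that
$\mathbf{x}(t)$ and $\mathbf{y}(t)$ are linearly independent. Thus the general solution to $\mathbf{x}' = A\mathbf{x}$ is
\[ \mathbf{x}(t; C_1, C_2) = C_1\mathbf{x}(t) + C_2\mathbf{y}(t) = C_1e^t\begin{bmatrix} \cos t \\ \cos t - \sin t \end{bmatrix} + C_2e^t\begin{bmatrix} \sin t \\ \cos t + \sin t \end{bmatrix}. \]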
The general case works exactly the same way. 


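Theorem 6.2.4 (Complex conjugate roots). Suppose $A \in \operatorname{Mat}_{2\times 2}(\mathbb{R})$ has complex
conjugate eigenvalues $\lambda, \overline{\lambda} \notin \mathbb{R}$, and $\mathbf{w}$ is an eigenvector associated to $\lambda$. Then $\overline{\mathbf{w}}$
is an eigenvector associated with $\overline{\lambda}$. Furthermore:
(1) (Complex version) The general solution to $\mathbf{x}' = A\mathbf{x}$ in terms of complex-valued functions is:
\[ \mathbf{x}(t; C_1, C_2) = C_1e^{\lambda t}\mathbf{w} + C_2e^{\overline{\lambda}t}\overline{\mathbf{w}} \]
(2) (Real version) The general solution to $\mathbf{x}' = A\mathbf{x}$ in terms of real-valued
functions is:
\[ \mathbf{x}(t; C_1, C_2) = C_1e^{\alpha t}(\cos\beta t\,\mathbf{v}_1 - \sin\beta t\,\mathbf{v}_2) + C_2e^{\alpha t}(\sin\beta t\,\mathbf{v}_1 + \cos\beta t\,\mathbf{v}_2) \]
where $\lambda = \alpha + i\beta$ and $\mathbf{w} = \mathbf{v}_1 + i\mathbf{v}_2$.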
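
As a sanity check on the real version of Theorem 6.2.4, the sketch below (not from the original notes; the names `z1`, `real_solutions` and `residual` are hypothetical) uses Python's `cmath` to form the complex solution of Example 6.2.3, takes real and imaginary parts, and verifies that both parts satisfy x' = Ax. It relies on the fact that the derivative of e^{λt}w is λe^{λt}w.

```python
import cmath
import math

A = ((0.0, 1.0), (-2.0, 2.0))             # matrix of Example 6.2.3
LAM = complex(1, 1)                       # eigenvalue 1 + i
W = (complex(1, 0), complex(1, 1))        # eigenvector [1, 1 + i]

def z1(t):
    """Complex-valued solution e^{lambda t} w."""
    s = cmath.exp(LAM * t)
    return (s * W[0], s * W[1])

def real_solutions(t):
    """Real and imaginary parts of z1(t), i.e. x(t) and y(t)."""
    z = z1(t)
    return (z[0].real, z[1].real), (z[0].imag, z[1].imag)

def mat_vec(M, v):
    return tuple(M[i][0] * v[0] + M[i][1] * v[1] for i in range(2))

def residual(t):
    """Largest entry of x' - Ax and y' - Ay, using z1'(t) = lambda * z1(t)."""
    dz = tuple(LAM * entry for entry in z1(t))
    x, y = real_solutions(t)
    dx = (dz[0].real, dz[1].real)
    dy = (dz[0].imag, dz[1].imag)
    Ax, Ay = mat_vec(A, x), mat_vec(A, y)
    return max(abs(dx[i] - Ax[i]) for i in range(2)), max(abs(dy[i] - Ay[i]) for i in range(2))

t = 0.7
x, y = real_solutions(t)
print(x, y, residual(t))
```

The values of `x` and `y` agree with the closed forms e^t[cos t, cos t - sin t] and e^t[sin t, cos t + sin t] obtained in the example.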
Double real root easy case. We now turn our attention to the case when
$p(\lambda) = (\lambda - \lambda_1)^2$, i.e., when the characteristic polynomial has only one root of
multiplicity two. First, we point out that exactly one of two things can happen:


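(1) (Easy case) Either we can find two linearly independent eigenvectors
$\mathbf{v}_1, \mathbf{v}_2 \in \mathbb{R}^2$ associated to $\lambda_1$. An example of this case is
\[ A = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_1 \end{bmatrix} \]
which has linearly independent eigenvectors:
\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad \mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]
(actually any two linearly independent vectors in $\mathbb{R}^2$ would work for this).
(2) (Interesting case) Or we can only find one linearly independent eigenvector
$\mathbf{v}_1 \in \mathbb{R}^2$ associated to $\lambda_1$. An example of this case is
\[ A = \begin{bmatrix} \lambda_1 & 1 \\ 0 & \lambda_1 \end{bmatrix} \]
which has only one linearly independent eigenvector:
\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]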


We will first look at the easy case. We will actually be able to completely solve the 
easy case, due to the following fact: 


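Fact 6.2.5. Suppose $A \in \operatorname{Mat}_{2\times 2}(\mathbb{R})$ has one real eigenvalue $\lambda$ of multiplicity two.
Furthermore, suppose we can find two linearly independent eigenvectors associated
to $\lambda$. Then
\[ A = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix}. \]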


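PROOF. Suppose $\mathbf{v}_1, \mathbf{v}_2 \in \mathbb{R}^2$ are two linearly independent eigenvectors of $A$ associated
to $\lambda$. Then $\operatorname{null}(A - \lambda I) = \operatorname{Span}(\mathbf{v}_1, \mathbf{v}_2) = \mathbb{R}^2$. In particular, we know the
following two vectors are also linearly independent eigenvectors of $A$ associated to $\lambda$:
\[ \mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad\text{and}\quad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \]
i.e., $\mathbf{e}_1$ and $\mathbf{e}_2$ form an eigenbasis of $A$.
Now suppose $a, b, c, d \in \mathbb{R}$ are such that
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}. \]
Then the condition $A\mathbf{e}_1 = \lambda\mathbf{e}_1$ tells us that $a = \lambda$, $c = 0$, and the condition
$A\mathbf{e}_2 = \lambda\mathbf{e}_2$ tells us that $b = 0$, $d = \lambda$. Thus
\[ A = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix}. \]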


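This yields the following:


Theorem 6.2.6 (Double real root; easy case). Suppose $A \in \operatorname{Mat}_{2\times 2}(\mathbb{R})$ has only
one eigenvalue $\lambda \in \mathbb{R}$ (of multiplicity two). Furthermore, suppose we can find two
linearly independent eigenvectors of $A$ associated to $\lambda$. Then the general solution
to $\mathbf{x}' = A\mathbf{x}$ is
\[ \mathbf{x}(t; C_1, C_2) = C_1e^{\lambda t}\begin{bmatrix} 1 \\ 0 \end{bmatrix} + C_2e^{\lambda t}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} C_1e^{\lambda t} \\ C_2e^{\lambda t} \end{bmatrix}. \]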


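PROOF. Let
\[ \mathbf{x}_1(t) := e^{\lambda t}\begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad\text{and}\quad \mathbf{x}_2(t) := e^{\lambda t}\begin{bmatrix} 0 \\ 1 \end{bmatrix}. \]
By assumption, the eigenspace of $\lambda$ is two-dimensional, so it must be all of $\mathbb{R}^2$.
Thus the following two vectors are eigenvectors associated to $\lambda$:
\[ \mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \]
Thus by an earlier Proposition both $\mathbf{x}_1(t)$ and $\mathbf{x}_2(t)$ are solutions to $\mathbf{x}' = A\mathbf{x}$. Furthermore,
since $\mathbf{x}_1(0), \mathbf{x}_2(0)$ are linearly independent, it follows that $\mathbf{x}_1(t), \mathbf{x}_2(t)$ are also
linearly independent. Thus by an earlier Theorem it follows that the general solution is
\[ \mathbf{x}(t; C_1, C_2) = C_1\mathbf{x}_1(t) + C_2\mathbf{x}_2(t) = C_1e^{\lambda t}\begin{bmatrix} 1 \\ 0 \end{bmatrix} + C_2e^{\lambda t}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} C_1e^{\lambda t} \\ C_2e^{\lambda t} \end{bmatrix}. \]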

Double real root interesting case. We now proceed with the interesting 
case. We investigate it by example. 


Example 6.2.7. Find the general solution to $\mathbf{x}' = A\mathbf{x}$, where
\[ A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}. \]
SOLUTION. We begin by finding the eigenvalues and associated eigenvectors of A. 
The characteristic polynomial is 


\[ p(\lambda) = \det\begin{bmatrix} 1-\lambda & 1 \\ 0 & 1-\lambda \end{bmatrix} = (1-\lambda)(1-\lambda) = (\lambda - 1)^2. \]


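Thus $\lambda_1 = 1$ is the only eigenvalue (of multiplicity two). Now we find all of the
associated eigenvectors, i.e., we compute a basis for $\operatorname{null}(A - 1I)$. Note that
\[ \left[\begin{array}{cc|c} 0 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right] \]
is already in RREF. Since there is one free variable, there is only one linearly
independent eigenvector:
\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
This tells us that
\[ \mathbf{x}_1(t) = e^{\lambda_1 t}\mathbf{v}_1 = e^t\begin{bmatrix} 1 \\ 0 \end{bmatrix} \]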


is a solution to $\mathbf{x}' = A\mathbf{x}$. We are not done yet because we still need a second linearly
independent solution. However, it appears that we are stuck since we don't have
any more linearly independent eigenvectors of $A$ (i.e., $A$ fails to have an eigenbasis).
The solution is to guess that $\mathbf{x}' = A\mathbf{x}$ has a solution of the form:
\[ \mathbf{x}(t) = e^{\lambda_1 t}(\mathbf{v}_2 + t\mathbf{v}_1) \]


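for some vectors $\mathbf{v}_1, \mathbf{v}_2 \in \mathbb{R}^2$. Supposing we have a solution of this form, let's see
what this means for the vectors $\mathbf{v}_1, \mathbf{v}_2$. Note that
\[ \mathbf{x}'(t) = \lambda_1e^{\lambda_1 t}(\mathbf{v}_2 + t\mathbf{v}_1) + e^{\lambda_1 t}\mathbf{v}_1 = e^{\lambda_1 t}\big((\lambda_1\mathbf{v}_2 + \mathbf{v}_1) + \lambda_1t\mathbf{v}_1\big) \]
whereas
\[ A\mathbf{x}(t) = e^{\lambda_1 t}(A\mathbf{v}_2 + tA\mathbf{v}_1). \]
Equating these expressions and dividing by $e^{\lambda_1 t}$ (which is never zero) yields
\[ (\lambda_1\mathbf{v}_2 + \mathbf{v}_1) + \lambda_1t\mathbf{v}_1 = A\mathbf{v}_2 + tA\mathbf{v}_1. \]
Since this needs to be true for all $t$, comparing the constant terms and the coefficients of $t$ yields:
\[ A\mathbf{v}_2 = \lambda_1\mathbf{v}_2 + \mathbf{v}_1 \]
\[ A\mathbf{v}_1 = \lambda_1\mathbf{v}_1. \]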


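In other words, $\mathbf{v}_1$ must be an eigenvector associated to $\lambda_1$, and $\mathbf{v}_2$ must be a
solution to the equation
\[ (A - \lambda_1I)\mathbf{v}_2 = \mathbf{v}_1. \]
We have already found above that
\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]
works as an eigenvector. Now we will solve the equation
\[ (A - \lambda_1I)\mathbf{v}_2 = \mathbf{v}_1. \]
Setting up the augmented matrix yields:
\[ \left[\begin{array}{cc|c} 0 & 1 & 1 \\ 0 & 0 & 0 \end{array}\right] \]
which is already in RREF. We find that $x_1$ is a free variable, $x_2$ is a pivot variable,
and the general solution is
\[ x_1 = s, \qquad x_2 = 1, \]
which in vector form is
\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} + s\begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
Thus
\[ \mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]
is a particular vector that works. This gives us the second solution:
\[ \mathbf{x}_2(t) = e^{\lambda_1 t}(\mathbf{v}_2 + t\mathbf{v}_1) = e^t\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix} + t\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} te^t \\ e^t \end{bmatrix}. \]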


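We conclude the general solution is
\[ \mathbf{x}(t; C_1, C_2) = C_1\mathbf{x}_1(t) + C_2\mathbf{x}_2(t) = C_1e^t\begin{bmatrix} 1 \\ 0 \end{bmatrix} + C_2e^t\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix} + t\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} C_1e^t + C_2te^t \\ C_2e^t \end{bmatrix}. \]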
The general situation works exactly the same way: 


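Theorem 6.2.8. Suppose $A \in \operatorname{Mat}_{2\times 2}(\mathbb{R})$ has only one eigenvalue $\lambda \in \mathbb{R}$ (of
multiplicity two). Furthermore, suppose we can only find one linearly independent
eigenvector $\mathbf{v}_1$ of $A$ associated to $\lambda$. Then the general solution to $\mathbf{x}' = A\mathbf{x}$ is
\[ \mathbf{x}(t; C_1, C_2) = C_1e^{\lambda t}\mathbf{v}_1 + C_2e^{\lambda t}(\mathbf{v}_2 + t\mathbf{v}_1) \]
where $\mathbf{v}_2 \in \mathbb{R}^2$ is any particular solution to the matrix equation $(A - \lambda I)\mathbf{v}_2 = \mathbf{v}_1$.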
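
The formula in Theorem 6.2.8 can be tested directly on Example 6.2.7. The sketch below is illustrative only (the function names are hypothetical and the vectors are hard-coded from the example): it checks the generalized eigenvector equation (A - λI)v2 = v1 and then compares the hand-computed derivative of the general solution against A x(t).

```python
import math

A = ((1.0, 1.0), (0.0, 1.0))   # matrix of Example 6.2.7
LAM = 1.0                      # the double eigenvalue
V1 = (1.0, 0.0)                # eigenvector
V2 = (0.0, 1.0)                # particular solution of (A - LAM*I) v2 = v1

def mat_vec(M, v):
    return tuple(M[i][0] * v[0] + M[i][1] * v[1] for i in range(2))

def solution(c1, c2, t):
    """x(t) = c1 e^{lam t} v1 + c2 e^{lam t} (v2 + t v1)."""
    e = math.exp(LAM * t)
    return tuple(c1 * e * V1[i] + c2 * e * (V2[i] + t * V1[i]) for i in range(2))

def solution_derivative(c1, c2, t):
    """Derivative of solution(), computed by the product rule."""
    e = math.exp(LAM * t)
    return tuple(
        c1 * LAM * e * V1[i] + c2 * e * (LAM * (V2[i] + t * V1[i]) + V1[i])
        for i in range(2)
    )

# (A - lam I) v2 should equal v1.
shifted = tuple(tuple(A[i][j] - (LAM if i == j else 0.0) for j in range(2)) for i in range(2))
print(mat_vec(shifted, V2))

x = solution(2.0, 3.0, 0.5)
print(x, solution_derivative(2.0, 3.0, 0.5), mat_vec(A, x))
```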


6.3. Higher-order linear equations 


This section is a sequel to Chapter 4. There we
considered second-order linear equations:
\[ y'' + p(t)y' + q(t)y = g(t), \]
and specifically homogeneous second-order linear equations with constant coefficients:
\[ y'' + py' + qy = 0 \quad\text{with } p, q \in \mathbb{R}. \]
In this section¹ we will discuss homogeneous $n$th order linear equations with constant coefficients:
\[ y^{(n)} + a_1y^{(n-1)} + \cdots + a_{n-1}y' + a_ny = 0 \quad\text{with } a_1, \ldots, a_n \in \mathbb{R}. \]
We will solve these equations in a three-step process:
We will solve these equations in a three-step process: 


(1) Convert the $n$th order linear equation (in one unknown function) to an $n \times n$
linear system (with $n$ unknown functions).

(2) Solve the $n \times n$ linear system.

(3) Convert the solution back in terms of a solution of the original linear
differential equation.


We begin with a fairly representative example:
Example 6.3.1. Find the general solution to:
\[ y^{(4)} - 13y'' + 36y = 0. \]


¹In [2, §9.8] they consider more general linear equations of the form $y^{(n)} + a_1(t)y^{(n-1)} + \cdots + a_{n-1}(t)y' + a_n(t)y = F(t)$, which might not have constant coefficients and might be inhomogeneous
with a nonconstant forcing term. We will restrict our discussion to the homogeneous
constant coefficient case.
SOLUTION. This is an equation with one unknown function. The first thing we
do is convert this into an equation with four unknown functions by introducing
three more auxiliary variables. Note that we will have to deal with four derivatives
of $y(t)$, so to turn this into a first-order linear system, we define $x_2(t) := y'(t)$,
$x_3(t) := y''(t) = x_2'(t)$, and $x_4(t) := y'''(t) = x_3'(t)$. Finally, to make the notation
uniform, we also set $x_1(t) := y(t)$. This gives us the obvious conditions:
\[
\begin{aligned}
x_1'(t) &= x_2(t) \\
x_2'(t) &= x_3(t) \\
x_3'(t) &= x_4(t)
\end{aligned}
\]
What about $x_4'(t) = y^{(4)}(t)$? The original differential equation itself tells us how to
relate this to the lower derivatives:
\[ x_4'(t) = 13y''(t) - 36y(t) = 13x_3(t) - 36x_1(t). \]
Combining these four equations yields the system:
\[ \begin{bmatrix} x_1'(t) \\ x_2'(t) \\ x_3'(t) \\ x_4'(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -36 & 0 & 13 & 0 \end{bmatrix}\begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \\ x_4(t) \end{bmatrix}. \]
Of course, ultimately we are only interested in the first unknown function $x_1(t)$, but
this is a quantity which we can read off as the first unknown function in a solution
to the above system. Let's proceed to solve this system.
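The first step is to compute the characteristic polynomial, using cofactor expansion along the bottom row:
\[
\begin{aligned}
p(\lambda) &= \det(A - \lambda I) \\
&= \det\begin{bmatrix} -\lambda & 1 & 0 & 0 \\ 0 & -\lambda & 1 & 0 \\ 0 & 0 & -\lambda & 1 \\ -36 & 0 & 13 & -\lambda \end{bmatrix} \\
&= 36\det\begin{bmatrix} 1 & 0 & 0 \\ -\lambda & 1 & 0 \\ 0 & -\lambda & 1 \end{bmatrix} - 13\det\begin{bmatrix} -\lambda & 1 & 0 \\ 0 & -\lambda & 0 \\ 0 & 0 & 1 \end{bmatrix} - \lambda\det\begin{bmatrix} -\lambda & 1 & 0 \\ 0 & -\lambda & 1 \\ 0 & 0 & -\lambda \end{bmatrix} \\
&= 36 - 13\lambda^2 + \lambda^4 \\
&= (\lambda - 2)(\lambda + 2)(\lambda - 3)(\lambda + 3).
\end{aligned}
\]
This gives us four eigenvalues, $\lambda_1 = 2$, $\lambda_2 = -2$, $\lambda_3 = 3$, $\lambda_4 = -3$. Next we compute
the corresponding eigenvectors (calculation omitted):
\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 2 \\ 4 \\ 8 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} -1 \\ 2 \\ -4 \\ 8 \end{bmatrix}, \quad \mathbf{v}_3 = \begin{bmatrix} 1 \\ 3 \\ 9 \\ 27 \end{bmatrix}, \quad \mathbf{v}_4 = \begin{bmatrix} -1 \\ 3 \\ -9 \\ 27 \end{bmatrix}. \]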
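In this case we have a real eigenbasis, so the general solution to $\mathbf{x}' = A\mathbf{x}$ is:
\[
\begin{aligned}
\mathbf{x}(t; C_1, C_2, C_3, C_4) &= C_1e^{2t}\begin{bmatrix} 1 \\ 2 \\ 4 \\ 8 \end{bmatrix} + C_2e^{-2t}\begin{bmatrix} -1 \\ 2 \\ -4 \\ 8 \end{bmatrix} + C_3e^{3t}\begin{bmatrix} 1 \\ 3 \\ 9 \\ 27 \end{bmatrix} + C_4e^{-3t}\begin{bmatrix} -1 \\ 3 \\ -9 \\ 27 \end{bmatrix} \\
&= \begin{bmatrix} C_1e^{2t} - C_2e^{-2t} + C_3e^{3t} - C_4e^{-3t} \\ 2C_1e^{2t} + 2C_2e^{-2t} + 3C_3e^{3t} + 3C_4e^{-3t} \\ 4C_1e^{2t} - 4C_2e^{-2t} + 9C_3e^{3t} - 9C_4e^{-3t} \\ 8C_1e^{2t} + 8C_2e^{-2t} + 27C_3e^{3t} + 27C_4e^{-3t} \end{bmatrix}.
\end{aligned}
\]
In particular, the general solution to $y^{(4)} - 13y'' + 36y = 0$ is
\[ y(t; C_1, C_2, C_3, C_4) = x_1(t) = C_1e^{2t} - C_2e^{-2t} + C_3e^{3t} - C_4e^{-3t}, \]
which (after renaming the arbitrary constants) we might as well instead write as
\[ y(t) = C_1e^{2t} + C_2e^{-2t} + C_3e^{3t} + C_4e^{-3t}. \]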


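In general, suppose we have an $n$th order homogeneous linear differential equation
with constant coefficients:
\[ y^{(n)} + a_1y^{(n-1)} + \cdots + a_{n-1}y' + a_ny = 0 \quad\text{with } a_1, \ldots, a_n \in \mathbb{R}. \]
Then we can introduce $n - 1$ additional unknown functions to stand for the higher
derivatives of $y$: $x_1(t) := y(t)$, $x_2(t) := x_1'(t) = y'(t)$, $x_3(t) := x_2'(t) = y''(t)$, ..., $x_n(t) := x_{n-1}'(t) = y^{(n-1)}(t)$. This gives the equations:
\[
\begin{aligned}
x_1'(t) &= x_2(t) \\
x_2'(t) &= x_3(t) \\
&\;\;\vdots \\
x_{n-1}'(t) &= x_n(t)
\end{aligned}
\]
Additionally, we can relate $x_n'(t) = y^{(n)}(t)$ to the lower derivatives using the original
differential equation:
\[ x_n'(t) = -a_nx_1(t) - a_{n-1}x_2(t) - \cdots - a_1x_n(t). \]
We then form the linear system:
\[ \begin{bmatrix} x_1'(t) \\ x_2'(t) \\ \vdots \\ x_{n-1}'(t) \\ x_n'(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_n & -a_{n-1} & -a_{n-2} & \cdots & -a_1 \end{bmatrix}\begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_{n-1}(t) \\ x_n(t) \end{bmatrix}. \]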

This gives us an n x n linear system of the form x’ = Ax. The matrix A in this 
context is called the companion matrix. The following sums up what is true in 
general: 
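
The conversion from an nth order equation to its companion matrix is mechanical, so it can be sketched in a few lines. The code below is an illustration rather than part of the notes; `companion` and `power_vector` are hypothetical helper names. It also checks the pattern visible in Example 6.3.1: if λ is a root of the characteristic polynomial, then [1, λ, λ², ..., λ^(n-1)] is an eigenvector of the companion matrix.

```python
def companion(coeffs):
    """Companion matrix for y^(n) + a1 y^(n-1) + ... + an y = 0.

    coeffs is the list [a1, a2, ..., an].
    """
    n = len(coeffs)
    A = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1                      # encodes x_i' = x_{i+1}
    for j in range(n):
        A[n - 1][j] = -coeffs[n - 1 - j]     # last row: -a_n, -a_{n-1}, ..., -a_1
    return A

def mat_vec(M, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def power_vector(lam, n):
    """The vector [1, lam, lam^2, ..., lam^(n-1)]."""
    return [lam ** k for k in range(n)]

# y'''' - 13 y'' + 36 y = 0 has a1 = 0, a2 = -13, a3 = 0, a4 = 36.
A = companion([0, -13, 0, 36])
for row in A:
    print(row)

for lam in (2, -2, 3, -3):
    v = power_vector(lam, 4)
    print(lam, mat_vec(A, v) == [lam * entry for entry in v])
```

For the eigenvalues -2 and -3 this produces the negatives of the eigenvectors listed in Example 6.3.1, which span the same eigenspaces.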
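Theorem 6.3.2. Consider the $n$th order homogeneous linear differential equation
with constant coefficients:

(A) $y^{(n)} + a_1y^{(n-1)} + \cdots + a_{n-1}y' + a_ny = 0$ with $a_1, \ldots, a_n \in \mathbb{R}$,

and let

(B) $\mathbf{x}' = A\mathbf{x}$

be the associated linear system.


(1) The following are equivalent:
(a) $y(t)$ is a solution to (A)
(b) the vector-valued function
\[ \mathbf{x}(t) = \begin{bmatrix} y(t) \\ y'(t) \\ \vdots \\ y^{(n-1)}(t) \end{bmatrix} \]
is a solution to (B).
(2) Suppose $y_1(t), \ldots, y_n(t)$ are solutions to (A). The following are equivalent:
(a) $y_1(t), \ldots, y_n(t)$ are linearly independent (as real-valued functions)
(b) The following vector-valued functions are linearly independent:
\[ \begin{bmatrix} y_1(t) \\ y_1'(t) \\ \vdots \\ y_1^{(n-1)}(t) \end{bmatrix}, \ \ldots, \ \begin{bmatrix} y_n(t) \\ y_n'(t) \\ \vdots \\ y_n^{(n-1)}(t) \end{bmatrix} \]
(c) For some $t_0$ the following determinant is nonzero:
\[ \det\begin{bmatrix} y_1(t_0) & y_2(t_0) & \cdots & y_n(t_0) \\ y_1'(t_0) & y_2'(t_0) & \cdots & y_n'(t_0) \\ \vdots & \vdots & & \vdots \\ y_1^{(n-1)}(t_0) & y_2^{(n-1)}(t_0) & \cdots & y_n^{(n-1)}(t_0) \end{bmatrix} \]
(d) For every $t$ the following determinant is nonzero:
\[ \det\begin{bmatrix} y_1(t) & y_2(t) & \cdots & y_n(t) \\ y_1'(t) & y_2'(t) & \cdots & y_n'(t) \\ \vdots & \vdots & & \vdots \\ y_1^{(n-1)}(t) & y_2^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t) \end{bmatrix} \]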


This motivates the following definition: 


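Definition 6.3.3. Let $y_1, \ldots, y_n : I \to \mathbb{R}$ be real-valued functions ($I \subseteq \mathbb{R}$ is an
interval). We define the Wronskian of $y_1, \ldots, y_n$ to be the function
\[ W(t) := \det\begin{bmatrix} y_1(t) & y_2(t) & \cdots & y_n(t) \\ y_1'(t) & y_2'(t) & \cdots & y_n'(t) \\ \vdots & \vdots & & \vdots \\ y_1^{(n-1)}(t) & y_2^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t) \end{bmatrix}. \]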
We can now summarize everything in terms only of the original differential equation:
Proposition 6.3.4. Suppose $y_1(t), \ldots, y_n(t)$ are solutions to
\[ y^{(n)} + a_1y^{(n-1)} + \cdots + a_{n-1}y' + a_ny = 0 \quad\text{with } a_1, \ldots, a_n \in \mathbb{R}. \]
Then $y_1, \ldots, y_n$ are linearly independent iff $W(t) \neq 0$ for every $t$ iff $W(t_0) \neq 0$ for some fixed
$t_0$. In this case, the general solution is
\[ y(t) = C_1y_1(t) + C_2y_2(t) + \cdots + C_ny_n(t). \]
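
For solutions of the form e^{rt}, the Wronskian at t0 = 0 is a determinant whose kth row consists of the kth powers of the rates r, so it can be evaluated exactly with integer arithmetic. The sketch below is illustrative only; `det` (a plain cofactor expansion) and `wronskian_of_exponentials_at_zero` are hypothetical names, and the approach is only sensible for small n.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def wronskian_of_exponentials_at_zero(rates):
    """W(0) for y_i(t) = e^{r_i t}: row k holds the k-th derivatives r_i^k at t = 0."""
    n = len(rates)
    return det([[r ** k for r in rates] for k in range(n)])

# The four solutions e^{2t}, e^{-2t}, e^{3t}, e^{-3t} from Example 6.3.1.
print(wronskian_of_exponentials_at_zero([2, -2, 3, -3]))
# Repeating a rate gives linearly dependent functions, hence a zero Wronskian.
print(wronskian_of_exponentials_at_zero([1, 1]))
```

A nonzero value at the single point t0 = 0 is enough, by Proposition 6.3.4, to conclude that the four exponentials are linearly independent.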


Of course, we are sweeping a few explanations under the rug. However, at this point 
you should believe that everything is properly justified using routine linear algebra 
arguments similar to those needed for homogeneous second-order linear equations, 
homogeneous matrix equations, and homogeneous linear systems. 
APPENDIX A


Special functions


In this appendix we give an overview of the relevant properties of common
elementary functions which arise in calculus and differential equations. In general
we will work within the realm of real numbers, although everything we say has an
appropriate extension to the bigger world of complex numbers. However, we might
occasionally have to refer to complex numbers.


A.1. Polynomials 
A polynomial (in the single variable $X$) is an expression of the form:
\[ p(X) = a_nX^n + a_{n-1}X^{n-1} + \cdots + a_2X^2 + a_1X + a_0 \quad (\text{where each } a_i \in \mathbb{R}). \]
If $a_n \neq 0$, then we call $n$ the degree of $p(X)$, denoted $\deg p = n$. We may also
choose to write a polynomial in summation notation:
\[ p(X) = \sum_{k=0}^{n} a_kX^k. \]
We naturally construe a polynomial as a function $p : \mathbb{R} \to \mathbb{R}$ by declaring for $\alpha \in \mathbb{R}$:
\[ p(\alpha) := a_n\alpha^n + a_{n-1}\alpha^{n-1} + \cdots + a_2\alpha^2 + a_1\alpha + a_0. \]


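Recall that given two polynomials $p(X) = \sum_{k=0}^{m} a_kX^k$ and $q(X) = \sum_{k=0}^{n} b_kX^k$, we
can form their sum:
\[ (p + q)(X) := \sum_{k} (a_k + b_k)X^k \]
and their product:
\[ (p \cdot q)(X) := \sum_{k}\Big(\sum_{i+j=k} a_ib_j\Big)X^k \]
where the above sums range over all possible indices (coefficients beyond the degree of a polynomial are taken to be zero).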
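
The sum and product formulas translate directly into operations on coefficient lists. The following is a minimal sketch, assuming (as a convention chosen here, not in the notes) that a polynomial is stored as a nonempty list `[a0, a1, ..., an]` with the constant term first; the function names are hypothetical.

```python
from itertools import zip_longest

def poly_add(p, q):
    """Coefficientwise sum; the shorter list is padded with zeros."""
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def poly_mul(p, q):
    """Product: the coefficient of X^k is the sum of a_i * b_j over i + j = k."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_eval(p, alpha):
    """Evaluate p(alpha) with Horner's rule."""
    result = 0
    for coeff in reversed(p):
        result = result * alpha + coeff
    return result

print(poly_add([1, 2], [3, 0, 5]))     # (1 + 2X) + (3 + 5X^2)
print(poly_mul([1, 1], [1, -1]))       # (1 + X)(1 - X)
print(poly_eval([1, 0, -1], 3))        # 1 - X^2 evaluated at X = 3
```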


Polynomials are perhaps the most well-behaved type of function which shows up in 
calculus. Indeed: 


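Fact A.1.1. Suppose
\[ p(X) = a_nX^n + a_{n-1}X^{n-1} + \cdots + a_1X + a_0 = \sum_{k=0}^{n} a_kX^k \]
is a polynomial of degree $n$. Then the following facts are true about $p(X)$ as a
function $p : \mathbb{R} \to \mathbb{R}$:


(1) $p$ is continuous on all of $\mathbb{R}$. In particular, for every $\alpha \in \mathbb{R}$:
\[ \lim_{x \to \alpha} p(x) = p(\alpha) \]
(2) The limits at infinity are computed as follows:
(a) if $n = 0$, then
\[ \lim_{x \to +\infty} p(x) = \lim_{x \to -\infty} p(x) = a_0 \]
(b) if $n \geq 1$ is even, then
\[ \lim_{x \to +\infty} p(x) = \lim_{x \to -\infty} p(x) = \begin{cases} +\infty & \text{if } a_n > 0 \\ -\infty & \text{if } a_n < 0 \end{cases} \]
(c) if $n \geq 1$ is odd, then
\[ \lim_{x \to +\infty} p(x) = \begin{cases} +\infty & \text{if } a_n > 0 \\ -\infty & \text{if } a_n < 0 \end{cases} \quad\text{and}\quad \lim_{x \to -\infty} p(x) = \begin{cases} -\infty & \text{if } a_n > 0 \\ +\infty & \text{if } a_n < 0 \end{cases} \]
(3) $p$ is differentiable on all of $\mathbb{R}$ with derivative
\[ \frac{dp}{dX}(X) = na_nX^{n-1} + (n-1)a_{n-1}X^{n-2} + \cdots + 2a_2X + a_1 = \sum_{k=1}^{n} ka_kX^{k-1} = \sum_{k=0}^{n-1} (k+1)a_{k+1}X^k \]
(4) Since the derivative of a polynomial is again a polynomial, $p$ is infinitely
differentiable on all of $\mathbb{R}$.
(5) Define the degree $n + 1$ polynomial:
\[ P(X) := \frac{a_n}{n+1}X^{n+1} + \frac{a_{n-1}}{n}X^n + \cdots + \frac{a_1}{2}X^2 + a_0X = \sum_{k=1}^{n+1} \frac{a_{k-1}}{k}X^k = \sum_{k=0}^{n} \frac{a_k}{k+1}X^{k+1} \]
Then:
(a) $P(X)$ is an antiderivative of $p(X)$, i.e.,
\[ \frac{d}{dX}P(X) = p(X), \]
(b) the indefinite integral of $p(X)$ is
\[ \int p(X)\,dX = P(X) + C, \]
(c) the definite integral of $p(X)$ is
\[ \int_a^b p(X)\,dX = P(b) - P(a), \]
for every $a, b \in \mathbb{R}$.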
The following is an important theoretical tool for studying polynomials: 
Fundamental Theorem of (Complex) Algebra A.1.2. Suppose $n \geq 1$. Then
for every polynomial
\[ p(X) = a_nX^n + a_{n-1}X^{n-1} + \cdots + a_1X + a_0 \]
of degree $n$, there exist complex numbers $\alpha_1, \ldots, \alpha_n \in \mathbb{C}$ such that
\[ p(X) = a_n(X - \alpha_1)(X - \alpha_2)\cdots(X - \alpha_n). \]
The numbers $\alpha_1, \ldots, \alpha_n$ in A.1.2 need not be distinct. One (very minor) drawback
of A.1.2 is that some of the roots might be complex numbers which are not real
numbers. Since we usually want to stick to working entirely with real numbers, the
following variant will be useful for us:


Fundamental Theorem of (Real) Algebra A.1.3. Suppose $n \geq 1$. Then for
every polynomial
\[ p(X) = a_nX^n + a_{n-1}X^{n-1} + \cdots + a_1X + a_0 \]
of degree $n$ with real coefficients, there exist $r, s \in \mathbb{N}$ with $r + 2s = n$, and real numbers
\[ \alpha_1, \ldots, \alpha_r, \beta_1, \ldots, \beta_s, \gamma_1, \ldots, \gamma_s \in \mathbb{R} \]
such that:
(1) $p$ can be factored into linear and quadratic factors
\[ p(X) = a_n\underbrace{(X - \alpha_1)\cdots(X - \alpha_r)}_{\text{linear factors}}\underbrace{(X^2 + \beta_1X + \gamma_1)\cdots(X^2 + \beta_sX + \gamma_s)}_{\text{quadratic factors}}, \]
and
(2) for each $i = 1, \ldots, s$, we have $\beta_i^2 - 4\gamma_i < 0$, i.e., the quadratic factor
$X^2 + \beta_iX + \gamma_i$ does not have real roots.


The real version is an easy consequence of the complex version, since complex roots of
real polynomials occur in conjugate pairs. Combining these conjugate pairs together is
what gives rise to the quadratic factors.


When dealing with quadratic polynomials with no real roots, the following trick is 
essential: 


Completing the Square A.1.4. Suppose $a, b, c \in \mathbb{R}$ are arbitrary such that $a \neq 0$.
Then
\[ aX^2 + bX + c = a\left(X + \frac{b}{2a}\right)^2 + c - \frac{b^2}{4a} = a\left[\left(X + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a^2}\right]. \]
If the discriminant $b^2 - 4ac < 0$ is negative, then the constant $(4ac - b^2)/4a^2 > 0$
is positive.
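
A small exact-arithmetic sketch of A.1.4 follows. It is an illustration only: the function name `complete_square` is hypothetical, and it returns the pair (h, k) for which aX² + bX + c = a[(X + h)² + k], using Python's `fractions.Fraction` to avoid rounding.

```python
from fractions import Fraction

def complete_square(a, b, c):
    """Return (h, k) with a X^2 + b X + c = a * ((X + h)^2 + k)."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a == 0:
        raise ValueError("a must be nonzero")
    h = b / (2 * a)                          # the shift b / (2a)
    k = (4 * a * c - b * b) / (4 * a * a)    # the constant (4ac - b^2) / (4a^2)
    return h, k

print(complete_square(1, -2, 2))    # X^2 - 2X + 2 = (X - 1)^2 + 1
print(complete_square(2, 4, 10))    # 2X^2 + 4X + 10 = 2[(X + 1)^2 + 4]
```

In both sample calls the discriminant is negative, and the returned constant k is positive, as the statement predicts.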


A.2. Rational functions 

A rational function (in the single variable $X$) is an expression of the form
\[ r(X) = \frac{a_mX^m + a_{m-1}X^{m-1} + \cdots + a_1X + a_0}{b_nX^n + b_{n-1}X^{n-1} + \cdots + b_1X + b_0} \quad (\text{where } a_i, b_j \in \mathbb{R}), \]
i.e., a rational function is a quotient
\[ r(X) = \frac{p(X)}{q(X)} \]
of polynomials, where $p(X) = a_mX^m + \cdots + a_0$ and $q(X) = b_nX^n + \cdots + b_0$.
Recall that given two rational functions $r_0(X) = p_0(X)/q_0(X)$ and $r_1(X) = p_1(X)/q_1(X)$, we can form their sum:
\[ (r_0 + r_1)(X) := \frac{p_0(X)q_1(X) + p_1(X)q_0(X)}{q_0(X)q_1(X)} \]
and their product:
\[ (r_0 \cdot r_1)(X) := \frac{p_0(X)p_1(X)}{q_0(X)q_1(X)}. \]
Just as with polynomials, we naturally construe a rational function as a real-valued
function. Since the denominator of a fraction is never allowed to be zero, the domain
of $r(X) = p(X)/q(X)$ is:
\[ \operatorname{domain}(r) := \{\alpha \in \mathbb{R} : q(\alpha) \neq 0\} \subseteq \mathbb{R}. \]
Then we define the function $r : \operatorname{domain}(r) \to \mathbb{R}$ by declaring for $\alpha \in \operatorname{domain}(r)$:
\[ r(\alpha) := \frac{p(\alpha)}{q(\alpha)}. \]
Warning A.2.1. In general the domain of a rational function might exclude so-called
removable singularities. For example, consider the following two rational
functions:
\[ r_0(X) = \frac{(X + 1)(X + 2)}{(X + 1)(X + 3)} \quad\text{and}\quad r_1(X) = \frac{X + 2}{X + 3}. \]
Then as real-valued functions, we have
\[ \operatorname{domain}(r_0) = \mathbb{R} \setminus \{-1, -3\} \quad\text{and}\quad \operatorname{domain}(r_1) = \mathbb{R} \setminus \{-3\}, \]
i.e., $r_0$ is defined everywhere except $-1$ and $-3$, whereas $r_1$ is defined everywhere except
$-3$. However, for every $\alpha \in \mathbb{R} \setminus \{-1, -3\}$, we have $r_0(\alpha) = r_1(\alpha)$. In other words,
$r_0$ and $r_1$ are essentially the same real-valued function except that $r_1$ is defined
at one more point than $r_0$ is. In some sense, the fact that $r_0$ does not have $-1$
in its domain is an artificial obstacle. It is due to the factor $X + 1$ occurring in
both the numerator and denominator. Since this has no effect on the value of the
function (since it contributes multiplication by 1), we can just cancel these factors
out and gain an extra point where the function is defined. In practice, when working
with rational functions, you always want to make sure that the numerator and the
denominator have no common factors so that you can work with the largest possible
"true" domain of the rational function.


In the rest of this section, we will ignore the issue of removable singularities. After 
polynomials, rational functions are the second best-behaved family of functions 
which show up in calculus: 


Fact A.2.2. Suppose
\[ r(X) = \frac{p(X)}{q(X)} \]
is a rational function with domain $D := \operatorname{domain}(r)$. Then the following facts are
true about $r(X)$ as a function $r : D \to \mathbb{R}$:


(1) $r$ is continuous on all of $D$. In particular, for every $\alpha \in D$:
\[ \lim_{x \to \alpha} r(x) = r(\alpha) \]
(2) $r$ is differentiable on all of $D$ with derivative
\[ \frac{dr}{dX}(X) = \frac{q(X)\frac{dp}{dX}(X) - p(X)\frac{dq}{dX}(X)}{(q(X))^2} \]
which is also a rational function with domain $D$.
(3) It follows that $r(X)$ is infinitely differentiable on $D$.
Integration of rational functions is a little bit more complicated and requires so-called
partial fraction decomposition. First, some terminology:


Definition A.2.3. Suppose $r(X) = p(X)/q(X)$ is a rational function. We say that
$r(X)$ is a proper rational function if $\deg p < \deg q$. Otherwise, we say that $r(X)$
is an improper rational function.


We have two versions of partial fraction decomposition, depending on whether every 
factor of the denominator is linear or not: 


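Partial Fraction Decomposition A.2.4 (Complex Case). Suppose
\[ r(X) = \frac{p(X)}{q(X)} \]
is a proper rational function with $\deg q = n$. Then:
(1) By Theorem A.1.2 there exist a nonzero real number $a \in \mathbb{R}$, distinct
complex numbers $\alpha_1, \ldots, \alpha_r \in \mathbb{C}$, and positive integers $n_1, \ldots, n_r \in \mathbb{N}$
such that

(a) $n_1 + \cdots + n_r = n$, and

(b) $q(X) = a(X - \alpha_1)^{n_1}\cdots(X - \alpha_r)^{n_r}$.

(2) There exist complex numbers $A_{ij} \in \mathbb{C}$ (for $1 \leq i \leq r$ and $1 \leq j \leq n_i$) such that
\[ \text{(A.1)} \qquad r(X) = \frac{p(X)}{q(X)} = \sum_{i=1}^{r}\sum_{j=1}^{n_i} \frac{A_{ij}}{(X - \alpha_i)^j}. \]


You should use the complex case any time every root of $q(X)$ is real, or if you want to work
with complex numbers. If not every root of $q(X)$ is real and you want to avoid
using complex numbers, then you should use the following: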


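Partial Fraction Decomposition A.2.5 (Real Case). Suppose
\[ r(X) = \frac{p(X)}{q(X)} \]
is a proper rational function with $\deg q = n$. Then:
(1) By Theorem A.1.3 there exist $r, s \in \mathbb{N}$ such that $r + 2s = n$, a nonzero
real number $a \in \mathbb{R}$, distinct real numbers $\alpha_1, \ldots, \alpha_t \in \mathbb{R}$, positive integers
$n_1, \ldots, n_t$, distinct pairs of real numbers $(\beta_1, \gamma_1), \ldots, (\beta_u, \gamma_u) \in \mathbb{R}^2$, and
positive integers $n_1', \ldots, n_u'$, such that:

(a) $n_1 + \cdots + n_t = r$,

(b) $n_1' + \cdots + n_u' = s$,

(c) the denominator factors as:
\[ q(X) = a(X - \alpha_1)^{n_1}\cdots(X - \alpha_t)^{n_t}(X^2 + \beta_1X + \gamma_1)^{n_1'}\cdots(X^2 + \beta_uX + \gamma_u)^{n_u'} \]
(d) for every $i = 1, \ldots, u$, we have $\beta_i^2 - 4\gamma_i < 0$, i.e., the quadratic factor
$X^2 + \beta_iX + \gamma_i$ does not have real roots.

(2) There exist real numbers $A_{ij}, B_{ij}, C_{ij} \in \mathbb{R}$ such that
\[ \text{(A.2)} \qquad r(X) = \sum_{i=1}^{t}\sum_{j=1}^{n_i} \frac{A_{ij}}{(X - \alpha_i)^j} + \sum_{i=1}^{u}\sum_{j=1}^{n_i'} \frac{B_{ij}X + C_{ij}}{(X^2 + \beta_iX + \gamma_i)^j}. \]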
An improper rational function can be written as a polynomial plus a proper
rational function:


Polynomial Division A.2.6. Suppose $p(X)$ and $q(X)$ are polynomials:
\[ p(X) = a_mX^m + \cdots + a_0 \]
\[ q(X) = b_nX^n + \cdots + b_0 \]
with $\deg p = m \geq \deg q = n$, i.e., the rational function $r(X) = p(X)/q(X)$ is
improper. Then:
(1) The following identity reduces the degree of the polynomial in the numerator:
\[ \frac{p(X)}{q(X)} = \frac{a_m}{b_n}X^{m-n} + \frac{p(X) - (a_m/b_n)X^{m-n}q(X)}{q(X)} \]
where $\deg\big(p(X) - (a_m/b_n)X^{m-n}q(X)\big) < \deg p(X)$.
(2) By repeating (1) enough times, there are real numbers $c_{m-n}, c_{m-n-1}, \ldots, c_0 \in \mathbb{R}$
with $c_{m-n} \neq 0$, and a polynomial $\tilde{p}(X)$ with $\deg \tilde{p}(X) < n$, such that:
\[ \frac{p(X)}{q(X)} = c_{m-n}X^{m-n} + \cdots + c_1X + c_0 + \frac{\tilde{p}(X)}{q(X)}. \]

It follows that any rational function can be written as a polynomial (possibly zero) 
plus a partial fraction decomposition of the form (A.1) or (A.2). Once we decompose 
a rational function like this, then we can integrate it according to the following rules: 


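(1) Integrate the polynomial part according to Fact A.1.1(5).
(2) For functions of the form $1/(X - \alpha)$, $\alpha \in \mathbb{R}$, the indefinite integral is:
\[ \int \frac{dX}{X - \alpha} = \ln|X - \alpha| + C \]
with domain $(-\infty, \alpha) \cup (\alpha, +\infty)$. Given $a < b \in \mathbb{R}$, the definite integral
is:
\[ \int_a^b \frac{dX}{X - \alpha} = \begin{cases} \ln(\alpha - b) - \ln(\alpha - a) & \text{if } a < b < \alpha \\ \ln(b - \alpha) - \ln(a - \alpha) & \text{if } \alpha < a < b \end{cases} \]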


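(3) For $n \geq 2$ and functions of the form $1/(X - \alpha)^n$, $\alpha \in \mathbb{R}$, the indefinite integral
is:
\[ \int \frac{dX}{(X - \alpha)^n} = -\frac{1}{(n-1)(X - \alpha)^{n-1}} + C \]
with domain $(-\infty, \alpha) \cup (\alpha, +\infty)$. Given $a < b \in \mathbb{R}$ such that $\alpha < a < b$
or $a < b < \alpha$, the definite integral is:
\[ \int_a^b \frac{dX}{(X - \alpha)^n} = \frac{1}{(n-1)(a - \alpha)^{n-1}} - \frac{1}{(n-1)(b - \alpha)^{n-1}} \]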

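(4) If $\beta, \gamma \in \mathbb{R}$ are such that $\beta^2 - 4\gamma < 0$, to compute the integral of $1/(X^2 + \beta X + \gamma)$,
you first complete the square in the denominator:
\[ \frac{1}{X^2 + \beta X + \gamma} = \frac{1}{(X + \beta/2)^2 + (4\gamma - \beta^2)/4} = \frac{1}{(X + \beta/2)^2 + \delta} \]
(where $\delta := (4\gamma - \beta^2)/4 > 0$) and then integrate using arctangent. The indefinite
integral is:
\[ \int \frac{dX}{X^2 + \beta X + \gamma} = \int \frac{dX}{(X + \beta/2)^2 + \delta} = \frac{1}{\sqrt{\delta}}\arctan\left(\frac{X + \beta/2}{\sqrt{\delta}}\right) + C \]
with domain $\mathbb{R}$. Given $a < b \in \mathbb{R}$, the definite integral is:
\[ \int_a^b \frac{dX}{X^2 + \beta X + \gamma} = \frac{1}{\sqrt{\delta}}\left(\arctan\left(\frac{b + \beta/2}{\sqrt{\delta}}\right) - \arctan\left(\frac{a + \beta/2}{\sqrt{\delta}}\right)\right) \]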


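(5) If $\beta, \gamma \in \mathbb{R}$ are such that $\beta^2 - 4\gamma < 0$ and $B \in \mathbb{R}$, to compute the integral of
$(X + B)/(X^2 + \beta X + \gamma)$, you first complete the square in the denominator, with $\delta := (4\gamma - \beta^2)/4$:
\[ \frac{X + B}{X^2 + \beta X + \gamma} = \frac{X + B}{(X + \beta/2)^2 + \delta} \]
Then you rewrite the numerator into two parts:
\[ \frac{X + B}{(X + \beta/2)^2 + \delta} = \frac{1}{2}\cdot\frac{2(X + \beta/2)}{(X + \beta/2)^2 + \delta} + \frac{B - \beta/2}{(X + \beta/2)^2 + \delta} \]
The integral of the second part is done as in (4); the indefinite integral of
the first part is:
\[ \int \frac{1}{2}\cdot\frac{2(X + \beta/2)\,dX}{(X + \beta/2)^2 + \delta} = \frac{1}{2}\ln\left|(X + \beta/2)^2 + \delta\right| + C \]
with domain $\mathbb{R}$.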


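(6) If $\beta, \gamma \in \mathbb{R}$ are such that $\beta^2 - 4\gamma < 0$ and $n \geq 2$, to compute the integral
of $1/(X^2 + \beta X + \gamma)^n$, you first complete the square in the denominator, with $\delta := (4\gamma - \beta^2)/4$:
\[ \frac{1}{(X^2 + \beta X + \gamma)^n} = \frac{1}{((X + \beta/2)^2 + \delta)^n} \]
Then to compute the antiderivative, you first do the substitution $U = X + \beta/2$, $dU = dX$:
\[ \int \frac{dX}{((X + \beta/2)^2 + \delta)^n} = \int \frac{dU}{(U^2 + \delta)^n} \]
Then you do the substitution $W = U/\sqrt{\delta}$, $dW = dU/\sqrt{\delta}$:
\[ \int \frac{dU}{(U^2 + \delta)^n} = \int \frac{\sqrt{\delta}\,dW}{((\sqrt{\delta}W)^2 + \delta)^n} = \frac{\sqrt{\delta}}{\delta^n}\int \frac{dW}{(W^2 + 1)^n} \]
Then to compute $\int dW/(W^2 + 1)^n$ you use the trigonometric substitution
$W = \tan\Theta$, $dW = \sec^2\Theta\,d\Theta$:
\[ \int \frac{dW}{(W^2 + 1)^n} = \int \frac{\sec^2\Theta\,d\Theta}{(\tan^2\Theta + 1)^n} = \int \frac{\sec^2\Theta}{\sec^{2n}\Theta}\,d\Theta = \int \cos^{2n-2}\Theta\,d\Theta. \]
At this point you use the rules for integrating powers of cosine.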
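(7) If $\beta, \gamma \in \mathbb{R}$ are such that $\beta^2 - 4\gamma < 0$, $B \in \mathbb{R}$, and $n \geq 2$, to compute the
integral of $(X + B)/(X^2 + \beta X + \gamma)^n$ you complete the square (with $\delta := (4\gamma - \beta^2)/4$) and break
up the numerator as in (5):
\[ \frac{X + B}{((X + \beta/2)^2 + \delta)^n} = \frac{1}{2}\cdot\frac{2(X + \beta/2)}{((X + \beta/2)^2 + \delta)^n} + \frac{B - \beta/2}{((X + \beta/2)^2 + \delta)^n} \]
Then the second integral is computed as in (6), and the first integral is:
\[ \int \frac{1}{2}\cdot\frac{2(X + \beta/2)\,dX}{((X + \beta/2)^2 + \delta)^n} = -\frac{1}{2(n-1)((X + \beta/2)^2 + \delta)^{n-1}} + C \]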
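
As a numerical sanity check of the arctangent rule in item (4), the sketch below compares the closed form, written with the shift X + β/2 obtained from completing the square, against a simple midpoint-rule approximation. It is illustrative only; `arctan_rule` and `midpoint_rule` are hypothetical names, and the sample values β = 2, γ = 5 are an arbitrary choice satisfying β² - 4γ < 0.

```python
import math

def arctan_rule(beta, gamma, a, b):
    """Definite integral of 1 / (X^2 + beta X + gamma) over [a, b] via arctangent."""
    delta = (4 * gamma - beta ** 2) / 4
    if delta <= 0:
        raise ValueError("requires beta^2 - 4*gamma < 0")
    s = math.sqrt(delta)
    return (math.atan((b + beta / 2) / s) - math.atan((a + beta / 2) / s)) / s

def midpoint_rule(f, a, b, steps=20000):
    """Plain midpoint Riemann sum, used only as an independent check."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

beta, gamma = 2.0, 5.0                     # X^2 + 2X + 5 = (X + 1)^2 + 4
exact = arctan_rule(beta, gamma, 0.0, 3.0)
approx = midpoint_rule(lambda x: 1.0 / (x * x + beta * x + gamma), 0.0, 3.0)
print(exact, approx)
```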


A.3. Algebraic functions 
A.4. The exponential function 


The exponential function is the most important function in mathematics.¹ Here is
its definition:


Definition A.4.1. Define the exponential function to be the function $\exp : \mathbb{R} \to \mathbb{R}$ defined by
\[ \exp(\alpha) := \sum_{n=0}^{\infty} \frac{\alpha^n}{n!} \]
for every $\alpha \in \mathbb{R}$.
In general we will never use the definition of the exponential function explicitly in
this class; we will only use known properties of the exponential function. Here are
some basic properties of the exponential function:


Fact A.4.2. Suppose $\alpha, \beta \in \mathbb{R}$ are arbitrary. Then we have:


(1) $\exp(\alpha + \beta) = \exp(\alpha)\exp(\beta)$,

(2) $\exp(0) = 1$,

(3) $\exp$ is strictly increasing, i.e., if $\alpha < \beta$, then $\exp(\alpha) < \exp(\beta)$, and
(4) for every $\alpha$, $\exp(\alpha) > 0$, and in particular, $\exp(\alpha) \neq 0$.


The exponential function is an extremely well-behaved function in calculus:


Fact A.4.3. The function $\exp : \mathbb{R} \to \mathbb{R}$ has the following properties:


(1) $\exp$ is continuous. In particular, for every $\alpha \in \mathbb{R}$,
\[ \lim_{x \to \alpha} \exp(x) = \exp(\alpha) \]
(2) the limits at $\pm\infty$ are as follows:
\[ \lim_{x \to +\infty} \exp(x) = +\infty \quad\text{and}\quad \lim_{x \to -\infty} \exp(x) = 0. \]
(3) In particular, $\operatorname{range}(\exp) = \{x \in \mathbb{R} : x > 0\} = (0, +\infty)$.
(4) $\exp$ is differentiable and
\[ \frac{d}{dx}\exp(x) = \exp(x). \]
(5) It follows that $\exp$ is infinitely differentiable.
(6) The indefinite integral of $\exp$ is:
\[ \int \exp(x)\,dx = \exp(x) + C \]
(7) Given $a < b \in \mathbb{R}$, the definite integral of $\exp$ is computed as:
\[ \int_a^b \exp(x)\,dx = \exp(b) - \exp(a). \]


¹See [3, pg. 1].

A.5. The logarithm 


We saw in Section A.4 that the exponential function $\exp : \mathbb{R} \to (0, +\infty)$ is strictly
increasing. In particular, it is invertible.


Definition A.5.1. We define the logarithm (or natural logarithm) to be the
function $\ln : (0, +\infty) \to \mathbb{R}$ defined by:
\[ \ln(y) = x \ :\Longleftrightarrow\ \exp(x) = y \]
for all $x \in \mathbb{R}$ and $y \in (0, +\infty)$. We also denote $\ln$ by $\log$.
Here are some basic properties of the logarithm:
Here are some basic properties of the logarithm: 


Fact A.5.2. Suppose $\alpha, \beta \in (0, +\infty)$ are arbitrary. Then we have:


(1) $\ln(\alpha\beta) = \ln\alpha + \ln\beta$,
(2) $\ln 1 = 0$, and
(3) $\ln$ is strictly increasing, i.e., if $\alpha < \beta$, then $\ln\alpha < \ln\beta$.


The logarithm is also a well-behaved function in calculus: 
Fact A.5.3. The function $\ln : (0, +\infty) \to \mathbb{R}$ has the following properties:
(1) $\ln$ is continuous. In particular, for every $\alpha \in (0, +\infty)$,
\[ \lim_{x \to \alpha} \ln x = \ln\alpha \]
(2) the limits at $0$ and $+\infty$ are as follows:
\[ \lim_{x \to 0^+} \ln x = -\infty \quad\text{and}\quad \lim_{x \to +\infty} \ln x = +\infty. \]
(3) In particular, $\operatorname{range}(\ln) = \mathbb{R}$.
(4) $\ln$ is differentiable and
\[ \frac{d}{dx}\ln x = \frac{1}{x} \]
(5) It follows that $\ln$ is infinitely differentiable on $(0, +\infty)$.
(6) The indefinite integral of $\ln$ is:
\[ \int \ln x\,dx = x\ln x - x + C, \]
where this family of antiderivatives is defined on $(0, +\infty)$.
(7) Given $0 < a < b \in \mathbb{R}$, the definite integral of $\ln$ is computed as:
\[ \int_a^b \ln x\,dx = b\ln b - b - a\ln a + a \]
APPENDIX B


Foundations


Occasionally in this class we shall mention things like: 
Sets 

Operations on sets, like union, intersection... 
Ordered pairs and cartesian products 
Relations and functions 


For this class, you only need a working understanding of these concepts at the level 
of Math31B. However, we include a more rigorous treatment of these topics in this 
appendix if you desire a deeper understanding. 


B.1. A Word about Definitions 


When we write “X := Y”, we mean that the object X does not have any meaning 
or definition yet, and we are defining X to be the same thing as Y. When we write 
“X = Y” we typically mean that the objects X and Y both already are defined and 
are the same. In other words, when writing “X := Y” we are performing an action 
(giving meaning to X) and when we write “X = Y” we are making an assertion of 
sameness. 


In making definitions, we will often use the word “if” in the form “We say that ... 
if...” or “If ..., then we say that ...”. When the word “if” is used in this way 
in definitions, it has the meaning of “if and only if” (but only in definitions!). For 
example: 


Definition B.1.1. Given integers $d, n$, we say that $d$ divides $n$ if there exists an
integer $k$ such that $n = dk$.


This convention is followed in accordance with mathematical tradition. Also, we
shall often write "iff" or "$\Longleftrightarrow$" to abbreviate "if and only if." (Only mathematicians
do this!)


B.2. Sets 


A set is a collection of mathematical objects. Mathematical objects can be almost
anything: numbers, other sets, functions, vectors, relations, matrices, graphs, etc.
For instance, $\{2, 5, 7\}$ and $\{1, 3, 5, 7, \ldots\}$ are sets. A member of a set is called an element of the set. The
membership relation is denoted with the symbol "$\in$"; for instance, we write "$2 \in \{2, 5, 7\}$"
(pronounced "2 is an element of the set $\{2, 5, 7\}$") to denote that the
number 2 is a member of the set $\{2, 5, 7\}$. There are several ways to describe a set:

(1) by explicitly listing the elements in that set, i.e., the set $\{2, 5, 7\}$ is a set
with three elements, the number 2, the number 5, and the number 7.




(2) by specifying a “membership requirement” that determines precisely which 
objects are in that set. For instance: 


{n € Z: nis positive and odd } 
ee SS 
membership requirement 


is the set of all odd positive integers. The above set is pronounced “the set 
of all integers n such that n is positive and odd”. The colon “:” is usually 
pronounced “such that”, and the condition to the right of the colon is the 
membership requirement. Defining a set in this way is sometimes referred 
to as using set-builder notation since you are describing how the set 
is built (in the above example, the set is built by taking all integers and 
keeping the ones that are positive and odd), instead of explicitly specifying 
which elements are in the set. We could also choose to describe the set 
above by writing 
{1, 3, 5, 7, ...},

although this might be a less ideal description because it requires the 
reader to guess or infer the meaning of “...”. 
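
Set-builder notation has a close analogue in Python's set comprehensions. The sketch below is only an illustration: a computer cannot enumerate all of ℤ, so we assume a finite window of integers (the name window and its bounds are our own choice).

```python
# Python cannot list all integers, so we assume a finite window of them.
window = range(-10, 11)

# Set-builder notation: {n in window : n is positive and odd}.
# The condition after "if" plays the role of the membership requirement.
odd_positive = {n for n in window if n > 0 and n % 2 == 1}

print(sorted(odd_positive))   # [1, 3, 5, 7, 9]
print(2 in {2, 5, 7})         # True: 2 is an element of {2, 5, 7}
print(4 in {2, 5, 7})         # False: 4 is not an element of {2, 5, 7}
```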


The following is a very famous set: 


Definition B.2.1. The empty set is the set which contains no elements (hence 
the name). It is denoted by either ∅ or {}.


The following are some of the main relationships two sets can have: 


Definition B.2.2. Suppose A and B are sets. We say that
(1) A is a subset of B (notation: A ⊆ B) if every element of A is also an
element of B, i.e.,
• For every x, if x ∈ A, then x ∈ B.
(2) A is equal to B (notation: A = B) if A and B have exactly the same
elements, i.e.,
• For every x, x ∈ A if and only if x ∈ B;
equivalently, A = B means the same thing as A ⊆ B and B ⊆ A.
(3) A is a proper subset of B (notation: A ⊊ B) if A ⊆ B and A ≠ B.


Note that for any set A, we automatically have ∅ ⊆ A.


Definition B.2.3. Given sets A and B, we define their union (notation: A ∪ B)
to be the set of all elements that are in either A or B, i.e.,


A ∪ B := {x : x ∈ A or x ∈ B}.


FIGURE B.1. Venn diagram of the union A ∪ B of the sets A and B


Definition B.2.4. Given sets A and B, we define their intersection (notation:
A ∩ B) to be the set of all elements they have in common, i.e.,


A ∩ B := {x : x ∈ A and x ∈ B}.
We say that two sets A and B are disjoint if A ∩ B = ∅.


FIGURE B.2. Venn diagram of the intersection A ∩ B of the sets
A and B


Definition B.2.5. Given sets A and B, we define their (set) difference (or
relative complement) (notation: A \ B) to be the subset of A of all elements in
A that are not in B, i.e.,


A \ B := {x : x ∈ A and x ∉ B}.


FIGURE B.3. Venn diagram of the difference A \ B of the sets A and B 
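
The relationships and operations from Definitions B.2.2 through B.2.5 correspond directly to operators on Python's built-in set type. The particular sets A and B below are our own example, chosen only to make each operation visible.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

print(sorted(A | B))          # union: [1, 2, 3, 4, 5]
print(sorted(A & B))          # intersection: [3, 4]
print(sorted(A - B))          # difference A \ B: [1, 2]

print({3, 4} <= A)            # subset test: True
print({3, 4} < A)             # proper subset test: True
print(A < A)                  # False: a set is never a proper subset of itself
print(set() <= A)             # True: the empty set is a subset of every set
print(A.isdisjoint({7, 8}))   # True: A and {7, 8} have no common element

# Equality of sets is the same as inclusion in both directions.
print((A <= B and B <= A) == (A == B))   # True
```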


Suppose we have elements a, b,c,d such that {a,b} = {c,d}. It is tempting in this 
situation to conclude that “a = c and b = d”, but in general this is false. Indeed, 
we have {1,2} = {2,1}, but 1 ≠ 2 and 2 ≠ 1. This is because elements of a set
are unordered. To get an ordered version of a two-element set we introduce the 
so-called ordered pair construction. 


Definition B.2.6. Given objects a and b, we define their ordered pair to be the
object:

(a, b) := {{a}, {a, b}}.

The right-hand side of the definition might seem a little funny, but it guarantees the
following:


Ordered Pair Property B.2.7. For every a, b, c, d,
(a, b) = (c, d) if and only if a = c and b = d.

PROOF. Exercise!


In practice, the Ordered Pair Property is really the only feature of ordered
pairs that is ever relevant. You will almost never have to actually deal with the
definition “(a, b) := {{a}, {a, b}}”, except when it comes to proving the Ordered Pair Property.
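
To see the set-theoretic definition of an ordered pair in action, here is a minimal Python sketch. The helper name kpair is hypothetical, and frozenset is used because Python sets can only contain immutable (hashable) members. The final check is a brute-force test on a small universe, not a proof of the Ordered Pair Property.

```python
def kpair(a, b):
    """Hypothetical helper: the set-theoretic pair (a, b) := {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


# Plain sets forget the order of their elements ...
print({1, 2} == {2, 1})              # True
# ... but the pair construction remembers it.
print(kpair(1, 2) == kpair(2, 1))    # False
print(kpair(1, 2) == kpair(1, 2))    # True

# Brute-force check of the Ordered Pair Property on the universe {0, 1, 2}.
U = range(3)
ok = all(
    (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
    for a in U for b in U for c in U for d in U
)
print(ok)                            # True
```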


Definition B.2.8. Given sets X and Y, we define the cartesian product (of X
and Y) (notation: X × Y) to be the following set:


X × Y := {(x, y) : x ∈ X and y ∈ Y}.


Example B.2.9. Suppose X = {0,1} and Y = {a,b,c}. Then the cartesian
product of X and Y is


X × Y = {(0, a), (0, b), (0, c), (1, a), (1, b), (1, c)}.
Note that |X| = 2, |Y| = 3, and |X × Y| = 2 · 3 = 6.
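
Example B.2.9 can be reproduced with itertools.product from the Python standard library. In this sketch the elements a, b, c are represented by the strings "a", "b", "c", which is our own modelling choice, and ordered pairs are represented by Python tuples.

```python
from itertools import product

X = {0, 1}
Y = {"a", "b", "c"}

# X x Y as a set of tuples (x, y) with x in X and y in Y.
XY = set(product(X, Y))

print(sorted(XY))
# [(0, 'a'), (0, 'b'), (0, 'c'), (1, 'a'), (1, 'b'), (1, 'c')]
print(len(XY) == len(X) * len(Y))   # True: 6 == 2 * 3
```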


The construction of pairs can be repeated: 


Definition B.2.10. We define ordered triples, ordered quadruples, and more 
generally ordered n-tuples recursively as follows: 


(a₁, a₂, a₃) := ((a₁, a₂), a₃)

(a₁, a₂, a₃, a₄) := ((a₁, a₂, a₃), a₄)

...

(a₁, ..., aₙ₊₁) := ((a₁, ..., aₙ), aₙ₊₁)

for any objects a₁, a₂, a₃, .... It follows that two ordered n-tuples (a₁, ..., aₙ) and
(b₁, ..., bₙ) are equal iff aᵢ = bᵢ for each i ∈ {1, ..., n}. Given sets A₁, ..., Aₙ, we
define their n-fold cartesian product to be the set


A₁ × ··· × Aₙ := {(a₁, ..., aₙ) : aᵢ ∈ Aᵢ for each i = 1, ..., n}.
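
The recursive definition of n-tuples can be mimicked in Python by nesting pairs. The helper name nested is hypothetical. Note that Python's own tuples, and the tuples produced by itertools.product, are flat rather than nested; the sketch only illustrates the recursion and the counting behind the n-fold product.

```python
from functools import reduce
from itertools import product


def nested(*objs):
    """Build ((...((a1, a2), a3)...), an) from a1, ..., an (intended for n >= 2)."""
    return reduce(lambda acc, x: (acc, x), objs)


print(nested(1, 2, 3))       # ((1, 2), 3)
print(nested(1, 2, 3, 4))    # (((1, 2), 3), 4)

# Two nested tuples are equal exactly when all components agree.
print(nested(1, 2, 3) == nested(1, 2, 3))   # True
print(nested(1, 2, 3) == nested(1, 3, 2))   # False

# Size of a 3-fold cartesian product (as flat Python tuples).
A1, A2, A3 = {0, 1}, {"a", "b", "c"}, {True, False}
triples = set(product(A1, A2, A3))
print(len(triples) == len(A1) * len(A2) * len(A3))   # True: 12 == 2 * 3 * 2
```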


B.3. Relations


The mathematical structures we will deal with usually carry more structure than
just the underlying set. For instance, we know that when we talk about the set
ℝ, we also want to be able to talk about the linear order < and the usual arithmetic
binary functions + and ·. If we didn’t have these notions available to us, then there
wouldn’t be anything that special about the set ℝ except that it’s a very, very large
set. The formal way to make sense of notions like these is through relations.


Definition B.3.1. Given sets X and Y, we define a (binary) relation on X × Y
(or a (binary) relation from X to Y) to be a subset R ⊆ X × Y. If R is a
relation on X × Y, then for an ordered pair (x, y) ∈ X × Y we will often write
x R y instead of (x, y) ∈ R, and
x R̸ y instead of (x, y) ∉ R.


(Note: x R y is pronounced “x is related to y (by R)”, and x R̸ y is pronounced “x is
not related to y (by R)”.)


Remark B.3.2. The word binary in Definition B.3.1 refers to the fact that R is a
relation on a cartesian product of two sets: X and Y. One can also define ternary
relations on X × Y × Z and even n-ary relations on X₁ × X₂ × ··· × Xₙ. In this
class we will (for the most part) restrict our attention to binary relations.

Example B.3.3. Consider X := {1,2,3,4} and Y := {a,b,c} and the binary
relation R on X × Y given by:


R := {(1, a), (1, b), (2, a), (4, b), (4, c)}.
The relation R tells us, among other things, that 1 R a but 3 R̸ y for every y ∈ Y. Since
X, Y are small, we can picture all the relations specified by R with the following
arrow diagram:


FIGURE B.4. Arrow diagram from X to Y illustrating the relation
R on X × Y
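
A finite relation can be stored in Python as a set of tuples. The sketch below uses our reading of the relation R from Example B.3.3; the exact list of pairs is an assumption, since only the facts “1 R a” and “3 is related to nothing” are stated explicitly. The helper name related is hypothetical.

```python
X = {1, 2, 3, 4}
Y = {"a", "b", "c"}

# Assumed reading of the relation R from Example B.3.3.
R = {(1, "a"), (1, "b"), (2, "a"), (4, "b"), (4, "c")}


def related(x, y, rel=R):
    """Return True iff x is related to y by rel, i.e. (x, y) is in rel."""
    return (x, y) in rel


# R is a relation on X x Y: it is a subset of the cartesian product.
print(R <= {(x, y) for x in X for y in Y})   # True

print(related(1, "a"))                        # True: 1 R a
print(any(related(3, y) for y in Y))          # False: 3 is related to no y in Y

# A textual stand-in for the arrow diagram: each x with the y's it points to.
arrows = {x: sorted(y for y in Y if related(x, y)) for x in sorted(X)}
print(arrows)   # {1: ['a', 'b'], 2: ['a'], 3: [], 4: ['b', 'c']}
```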


B.4. Functions 


We are already familiar with functions f: X → Y as being some sort of machine
that assigns to each input x ∈ X a unique output y ∈ Y. The formal way to view
functions is as a special case of relations:


Definition B.4.1. Suppose f is a relation on X × Y. We say that f is a function
from X to Y (notation: f: X → Y) if for every x ∈ X there is exactly one y ∈ Y
such that (x, y) ∈ f, i.e.,
(i) For each x ∈ X, there exists y ∈ Y such that (x, y) ∈ f.
(ii) For each x ∈ X, and for every y₁, y₂ ∈ Y, if (x, y₁) ∈ f and (x, y₂) ∈ f, then
y₁ = y₂.
Note: (i) asserts there is at least one y ∈ Y, and (ii) asserts there is at most one
y ∈ Y. Taken together, (i) and (ii) assert there is exactly one y ∈ Y (with the
property (x, y) ∈ f).
Suppose f: X → Y. Then:
(1) We shall use the notation f(x) = y to indicate that (x, y) ∈ f.
(2) The set X is called the domain of f (notation: domain(f) = X).
(3) The set Y is called the codomain of f (notation: codomain(f) = Y).
(4) The following subset of Y


range(f) := {f(x) : x ∈ X} = {y ∈ Y : there exists x ∈ X such that f(x) = y}


is called the range of f.

(5) We also may use the notation “x ↦ f(x): X → Y” instead of f: X → Y,
especially when the function f is determined by a formula in x and/or
it is not necessary to give a name to the function; see Example B.4.2(2)
below.
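
Conditions (i) and (ii) of Definition B.4.1 can be checked mechanically for finite relations. The following is a minimal sketch; the helper names is_function and range_of, as well as the sample relations f and R, are our own and purely illustrative.

```python
def is_function(f, X, Y):
    """Check whether the set of pairs f is a function from X to Y."""
    # f must first of all be a relation on X x Y.
    if not all(x in X and y in Y for (x, y) in f):
        return False
    for x in X:
        outputs = {y for (x2, y) in f if x2 == x}
        # No output violates (i); two or more outputs violate (ii).
        if len(outputs) != 1:
            return False
    return True


def range_of(f):
    """Return the set of all second components of pairs in f."""
    return {y for (_, y) in f}


X = {1, 2, 3, 4}
Y = {"a", "b", "c"}
f = {(1, "a"), (2, "c"), (3, "a"), (4, "b")}   # hypothetical example
R = {(1, "a"), (1, "b"), (2, "a")}             # fails both (i) and (ii)

print(is_function(f, X, Y))   # True
print(is_function(R, X, Y))   # False
print(sorted(range_of(f)))    # ['a', 'b', 'c']
```

Here R fails (ii) because 1 has two outputs, and it fails (i) because 3 and 4 have none.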

Example B.4.2.
(1) Given a set X we define the identity function on X (notation: id_X: X →
X) to be the function that sends every x ∈ X to itself, i.e.,
id_X(x) := x, for every x ∈ X.


Note that in this case, domain(id_X) = codomain(id_X) = range(id_X) = X.
(2) The function
k ↦ k²: ℤ → ℤ
has domain ℤ, codomain ℤ and range {0, 1, 4, 9, 16, ...}.
(3) The function
x ↦ x²: ℝ → ℝ
has domain ℝ, codomain ℝ and range {y ∈ ℝ : y ≥ 0}.


Question B.4.3. What is the codomain of the following function:
f := {(1, a), (2, c), (3, ), (4, b)}


Answer B.4.4. Trick question! The domain is definitely the set X := {1, 2, 3, 4};
however, the codomain can technically be any set which contains Y := {a, b, c}.
Indeed, f is a valid function of type “X → Y” (in which case, the codomain would
be Y), but it is also a valid function of type “X → Y ∪ {d, e, f}” (in which case,
the codomain would be Y ∪ {d, e, f} = {a, b, c, d, e, f}). The lesson here is that the
codomain is determined by what we say it is when we are specifying the function
as either f: X → Y or f: X → Y ∪ {d, e, f}. This annoyance only occurs for the
codomain. The domain is always uniquely determined (as mentioned above) from
the underlying set of ordered pairs, as is the range (which in this case is Y).


Just as with relations, we can form a new function from two given functions by 
composition. 


Definition B.4.5. Suppose f: X → Y and g: Y → Z are functions. Then the
composition of g with f is the function g ∘ f: X → Z defined by:
(g ∘ f)(x) := g(f(x)) := the unique z ∈ Z such that there is a y ∈ Y
such that f(x) = y and g(y) = z.


Remark B.4.6.


(1) Suppose we have three functions f: X → Y, g: Y → Z and h: Z → W.
Then we can create two new functions through composition: g ∘ f: X → Z
and h ∘ g: Y → W. Finally, we can create two further functions:


h ∘ (g ∘ f): X → W and (h ∘ g) ∘ f: X → W.
It is a nice exercise to show that these functions are the same, i.e.,
h ∘ (g ∘ f) = (h ∘ g) ∘ f.
Thus we say that functional composition is associative.
(2) Functional composition allows us to highlight the two main properties of
the identity function id_X: X → X:


(a) For every function f: X → Y we have f ∘ id_X = f,
(b) For every function g: W → X we have id_X ∘ g = g.
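
For finite sets, composition, associativity, and the identity properties can be spot-checked in Python by storing each function as a dictionary. This is only a sketch on one hand-picked example (the functions f, g, h and the helper names compose and identity are our own), so it illustrates Remark B.4.6 rather than proving it.

```python
def compose(g, f):
    """Return g o f as a dict, for finite functions stored as dicts."""
    return {x: g[f[x]] for x in f}


def identity(S):
    """Return the identity function on the finite set S as a dict."""
    return {s: s for s in S}


f = {1: "a", 2: "b", 3: "a"}       # f : X -> Y with X = {1, 2, 3}
g = {"a": 10, "b": 20}             # g : Y -> Z with Y = {"a", "b"}
h = {10: "ten", 20: "twenty"}      # h : Z -> W with Z = {10, 20}
X, Y = set(f), set(g)

print(compose(g, f))               # {1: 10, 2: 20, 3: 10}

# Associativity: h o (g o f) equals (h o g) o f.
print(compose(h, compose(g, f)) == compose(compose(h, g), f))   # True

# Identity properties: f o id_X = f and id_Y o f = f.
print(compose(f, identity(X)) == f)    # True
print(compose(identity(Y), f) == f)    # True
```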


We can also (sometimes) consider the inverse of a function. 




Definition B.4.7. Suppose f: X → Y is a function. We say that a function
g: Y → X is an inverse to f if


f ∘ g = id_Y and g ∘ f = id_X.


We say that f: X → Y is an invertible function if there exists an inverse g: Y → X.


At this point, it is not clear whether every function has an inverse (answer: no), 
or even in the cases when a function does have an inverse whether that inverse is 
unique (answer: yes). The following clears up the latter issue: 


Lemma B.4.8 (Uniqueness of function inverse). Suppose f: X → Y is a function
and g, h: Y → X are inverses to f. Then g = h.


PROOF. Note that


g = g ∘ id_Y        by Remark B.4.6(2)
  = g ∘ (f ∘ h)     since h is an inverse of f
  = (g ∘ f) ∘ h     since composition is associative
  = id_X ∘ h        since g is an inverse of f
  = h               by Remark B.4.6(2).


One special feature of the proof of Lemma B.4.8 is that it used very general
principles (compositional property of identity, definition of inverse, associativity) and
did not mention specific elements x ∈ X at all. Analogues of this argument show
up in many other areas of math, for example, in the proof that the inverse of an
invertible matrix is unique. At any rate, we can now unambiguously define the
inverse f⁻¹ of an invertible function f:


Definition B.4.9. Suppose f: X → Y is an invertible function. Then we define
f⁻¹: Y → X to be the (unique) inverse of f.


B.5. Three Special Types of Functions 
There are three special flavors of functions which permeate all of mathematics: 


Definition B.5.1. A function f: X → Y is called

(1) injective (or one-to-one) if for every x₁, x₂ ∈ X, if f(x₁) = f(x₂), then
x₁ = x₂.

(2) surjective (tacitly: surjective onto Y) (or onto) if for every y ∈ Y
there exists an x ∈ X such that f(x) = y. Equivalently, f is surjective if
range(f) = codomain(f).

(3) bijective (or a bijection, or one-to-one and onto) if f is both injective
and surjective.


Note that the notion of surjective (as well as bijective) only makes sense when it is
clear what the codomain is. If you change what the codomain is, the function might
change whether it is surjective or not. For instance, in Question B.4.3 the function
f: X → Y is surjective, but the function f: X → Y ∪ {d, e, f} is not surjective,
even though the two f’s have the same underlying set!
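
The three properties in Definition B.5.1 are easy to test for finite functions stored as Python dictionaries. In the sketch below the helper names are hypothetical, and the sample function f is an assumption modelled on Question B.4.3 (the value assigned to 3 is our own choice). Note that the surjectivity test needs the codomain as an explicit argument, in line with the remark above.

```python
def is_injective(f):
    """No two inputs share an output."""
    return len(set(f.values())) == len(f)


def is_surjective(f, codomain):
    """Every element of the stated codomain is hit by some input."""
    return set(f.values()) == set(codomain)


def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)


f = {1: "a", 2: "c", 3: "a", 4: "b"}   # assumed sample function on X = {1, 2, 3, 4}
Y = {"a", "b", "c"}

print(is_surjective(f, Y))                     # True
print(is_surjective(f, Y | {"d", "e", "f"}))   # False: same pairs, larger codomain
print(is_injective(f))                         # False: f(1) = f(3)
print(is_bijective({1: "a", 2: "b", 3: "c"}, Y))   # True
```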


We give some simple examples of functions which either have or do not have each 
of these properties: 

Example B.5.2.


(1) Suppose X = {a, b, c} and Y = {d, e, f}. Then the function f: X → Y
specified in Figure B.5 is a bijection, i.e., it is both injective and surjective.


FIGURE B.5. A bijective (i.e., an injective and surjective) function


(2) Suppose X = {a, b, c} and Y = {d, e}. Then the function f: X → Y
specified in Figure B.6 is a surjective function but it is not injective.


FIGURE B.6. A surjective function that is not bijective


(3) Suppose X = {a, b} and Y = {c, d, e}. Then the function f: X → Y
specified in Figure B.7 is an injective function but it is not surjective.


FIGURE B.7. An injective function that is not surjective


(4) Suppose X = {a, b} and Y = {c, d}. Then the function f: X → Y
specified in Figure B.8 is neither injective nor surjective.


FIGURE B.8. A function that is neither injective nor surjective


These notions allow us to characterize which functions are invertible: 


Theorem B.5.3. Suppose f: X → Y is a function. The following are equivalent:
(1) f is a bijection. 
(2) f is invertible. 


PROOF. Exercise!
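
For finite functions, the content of Theorem B.5.3 can be observed experimentally: swapping the components of each pair yields an inverse exactly when the function is a bijection. The sketch below is an illustration and not a proof; the helper name inverse and the three sample functions are our own, chosen to match the shapes of the sets in Example B.5.2 rather than the actual figures.

```python
def inverse(f, codomain):
    """Return the inverse of f as a dict, or None if f is not a bijection."""
    g = {}
    for x, y in f.items():
        if y in g:
            # Two inputs share the output y, so f is not injective.
            return None
        g[y] = x
    if set(g) != set(codomain):
        # Some element of the codomain is never hit, so f is not surjective.
        return None
    return g


f = {"a": "d", "b": "e", "c": "f"}          # a bijection onto {"d", "e", "f"}
g = inverse(f, {"d", "e", "f"})
print(g)                                     # {'d': 'a', 'e': 'b', 'f': 'c'}

# g o f = id_X and f o g = id_Y, as required by Definition B.4.7.
print(all(g[f[x]] == x for x in f))          # True
print(all(f[g[y]] == y for y in g))          # True

# Non-bijections have no inverse.
print(inverse({"a": "d", "b": "d", "c": "e"}, {"d", "e"}))   # None (not injective)
print(inverse({"a": "c", "b": "d"}, {"c", "d", "e"}))        # None (not surjective)
```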